An efficient lightweight network for silk fabric defect detection

Abstract

To address the challenges of balancing detection accuracy and computational efficiency in silk fabric defect detection, this paper proposes a lightweight model GSS-YOLOv8, designed to reduce parameter complexity while enabling real-time detection capabilities. A three-stage optimization strategy is adopted to target key bottlenecks. Firstly, in the backbone network, GhostHGNetV2 replaces the original feature extractor to enhance the feature representation of multiscale fabric defects while reducing the number of parameters. Secondly, a Slim-Neck structure is introduced, where the C2f module is replaced with VoVGSCSP and standard convolutions are substituted with GSConv, effectively reducing computational costs without sacrificing accuracy. Finally, a Shared Detail-Enhanced Head (SDEH) is designed. By sharing the parameters of two detail-enhanced convolutions, this module enhances the ability to capture fine-grained defect features and reduces parameter redundancy. The ablation experiments further evaluate the individual contributions of the GhostHGNetV2 backbone, the Slim-Neck paradigm design, and the proposed SDEH module, verifying the effectiveness of each improvement component. Additionally, experiments conducted on both the self-built silk fabric dataset and the Tianchi Fabric Defect Dataset confirm the feasibility and strong generalization capability of the proposed GSS-YOLOv8 model. The experimental results illustrate that compared with YOLOv8n, the GSS-YOLOv8 improves precision by 5.1 percentage points to 85.9% and mean average precision (mAP@0.5) by 2.1 percentage points to 86.5% with 80.0% recall, while reducing parameters by 51.3% to 1.46M, GFLOPs by 45.7% to 4.4G, and model size to only 3.7 MB, which fully meets the real-time detection requirements for silk fabric defects in industrial settings.

Keywords

silk fabric defect detection; lightweight object detection; YOLOv8; deep learning; GhostHGNetV2; GSConv

1. Introduction

Fabric defects refer to flaws or imperfections that arise during the production process and may impair the performance or visual quality of textile products. These defects typically manifest as color inconsistencies, surface damage, irregular shapes, or texture variations, appearing along or perpendicular to the direction of fabric movement. With advancements in spinning technology and the increasing variety and complexity of fabric types and patterns, the range of defect types has grown significantly, including issues such as holes, yarn breakage, and pattern misalignment.¹ These defects not only degrade the quality and appearance of fabrics but also lead to substantial resource waste, increased production costs, reduced market competitiveness, and significant economic losses. At present, many factories still rely on manual inspection, which is time-consuming, labor-intensive, and limited in accuracy (only 60–75%), and small defects are easily overlooked. Furthermore, prolonged inspection can cause visual fatigue and negatively impact workers’ health.² Therefore, replacing manual inspection with computer vision–based fabric defect detection technologies is of great practical and economic significance. Such approaches can achieve high-speed, high-efficiency, and high-precision defect identification and accelerate the intelligent transformation of the textile industry.³

Although traditional large-scale models offer high detection accuracy, they are often computationally intensive and resource-demanding, making them unsuitable for real-time defect detection in resource-constrained environments such as edge devices or on-site factory settings. Moreover, fabric defect detection models are typically deployed on embedded systems and edge computing platforms, where computing power and memory resources are limited. As a result, developing lightweight and easily deployable defect detection models has become a prominent research focus.

More and more deep learning-based methods are applied to inspect fabric defects. These methods are mainly categorized into two types: single-stage regression-based algorithms and two-stage algorithms that involve candidate region generation and classification. Currently, Faster R-CNN,⁴ Feature Pyramid Network (FPN),⁵ and Cascade R-CNN⁶ are considered the most effective and widely used two-stage detection algorithms in fabric defect detection. These methods typically generate a large number of region proposals using traditional image processing techniques or convolutional neural networks (CNNs), followed by classification and bounding box refinement.

However, the need to process numerous candidate regions results in high computational complexity and slow detection speed for two-stage algorithms. In contrast, single-stage algorithms perform classification and regression simultaneously using a single CNN, eliminating the proposal generation stage. Representative approaches include the Single Shot MultiBox Detector (SSD)⁷ and the You Only Look Once (YOLO) series.⁸ Compared with two-stage methods, single-stage detectors simplify the detection pipeline and achieve a better balance between speed and accuracy. Their advantages in fast inference, lightweight architecture, and deployment efficiency make them particularly well-suited for real-time, high-throughput inspection scenarios on fabric production lines.

Among various object detection algorithms, YOLO stands out as a representative single-stage architecture with significant advantages in lightweight deployment. Compared with two-stage detectors, YOLO integrates feature extraction, region proposal generation, and classification/regression into a unified end-to-end network, greatly simplifying the detection pipeline and reducing model complexity. In particular, YOLOv8 further incorporates anchor-free structures, C2f modules, and lightweight feature fusion mechanisms, achieving a favorable trade-off between accuracy and speed. Despite these architectural advances, YOLOv8 still introduces relatively complex feature extraction and fusion modules, resulting in increased network depth, large parameter sizes, high computational complexity, and substantial memory usage. These factors pose challenges for practical deployment in industrial scenarios, especially in resource-constrained environments. Therefore, the real-time deployment performance of YOLOv8 remains limited in such settings and requires further optimization.

Based on the above observations, this paper proposes a GSS-YOLOv8 fabric defect detection algorithm designed to achieve fast and accurate real-time detection. The main contributions of this work are summarized as follows.

1. Replacing the original YOLOv8 backbone with GhostHGNetV2 significantly reduces the number of model parameters while maintaining strong feature extraction performance.

2. The Slim-Neck paradigm design is introduced into the neck network, where the VoVGSCSP module and GSConv are used to replace the C2f and standard convolution layers, respectively. This approach reduces the number of model parameters and computational complexity while maintaining model accuracy.

3. A shared detail-enhanced detection head is designed, which shares parameters between two detail-enhanced convolutions to enhance the ability to capture defect details while reducing the number of detection head parameters and further simplifying the network.

2. Related works

This section briefly reviews related studies on fabric defect detection and lightweight model design.

2.1. Traditional detection methods

Since the 1980s, fabric defect detection has progressed from traditional image processing techniques to advanced deep learning-based approaches. Early methods are generally classified into four categories: model-based, spectral analysis-based, statistical, and structural approaches.⁹,¹⁰ Model-based methods treat fabric textures as stochastic processes and detect defects by evaluating deviations from a learned texture model,¹¹ such as autoregressive models¹² or Markov random field models.¹³ Spectral methods, such as Fourier,¹⁴ wavelet,¹⁵ and Gabor transforms,¹⁶ leverage the periodic nature of fabric patterns by analyzing frequency-domain features. Statistical methods compare feature distributions between normal and defective regions, using tools like histograms, gray-level co-occurrence matrices, and morphological operations.¹⁷ Structural methods extract handcrafted features based on the repetitive primitives within fabric textures.¹⁸

Deep learning models, particularly convolutional neural networks (CNNs), have demonstrated superior performance in fabric defect detection by leveraging end-to-end training and robust feature extraction. These approaches are mainly divided into two categories: two-stage algorithms and single-stage algorithms. Chen et al.¹⁹ embedded Gabor kernels into the Faster R-CNN framework to perform frequency-domain analysis and proposed a two-stage training strategy combining a genetic algorithm with backpropagation to optimize the model. Liu et al.²⁰ were the first to apply the SSD model to fabric defect detection and improved its performance for small defect identification by adding a third feature layer, making the enhanced SSD more suitable for this task. Su et al.²¹ incorporated squeeze-and-excitation (SE) modules into the FPN structure of the YOLOX network, achieving a 2.7% improvement in detection accuracy across eight fabric defect datasets while maintaining real-time performance. Si et al.²² developed a YOLOv8-based model by integrating the RepGhost module and introducing a novel information redistribution mechanism. They further employed Wasserstein distance loss to optimize localization of small objects, resulting in an 18.9% accuracy improvement on the AITEX dataset, albeit at the cost of increased model complexity.

2.2. Works on lightweight

Although the aforementioned methods have achieved significant progress in improving detection accuracy, they still face considerable challenges in lightweight deployment. Existing models generally pursue higher precision by introducing deeper and more complex network architectures, which inevitably leads to an excessive number of parameters and high computational overhead, thereby restricting their applicability to real-time scenarios on embedded or edge industrial devices. Current research on lightweight optimization has mainly focused on three mainstream detection frameworks, namely YOLO, SSD, and Faster R-CNN, while this paper concentrates on the YOLO framework to conduct lightweight exploration.

Some researchers have concentrated on improving the backbone network of YOLOv8 to achieve lightweight design and reduce computational complexity while maintaining robust feature extraction capabilities. Tie²³ reconstructed the YOLOv8n backbone network, using the low-computation KWConv to rebuild the Bottleneck and C2f modules and adopting the BiFPN feature fusion method to enhance the model’s contextual information, effectively reducing the number of model parameters and computational requirements. Ma et al.²⁴ introduced GhostNet to replace the YOLOv8 backbone network, substituting the original Conv layers with GhostConv and the original C2f layers with C3Ghost, thereby reducing computation and inference time while maintaining the accuracy and integrity of feature representation. Liu et al.²⁵ adopted the MobileNetv3 model to replace the C3 modules and convolutional modules in the YOLOv8 backbone, enabling the model to achieve a lightweight design while maintaining high detection accuracy.

Other works have focused on optimizing the neck structure to enhance multi-scale feature fusion efficiency and improve the detection performance for small targets. Xu et al.²⁶ integrated a lightweight adaptive downsampling (ADOWN) convolution module to reduce dimensionality and achieve high processing efficiency. Ma et al.²⁷ adopted a feature fusion network structure that combines a small-target detection head with a bidirectional feature pyramid network (SBiFPN) to capture the multi-scale information of defects and enhance the model’s feature fusion capability. In addition, some studies have aimed at refining the detection head of YOLOv8 by simplifying its structure, reducing computational overhead, and further improving detection accuracy. An et al.²⁸ replaced the original Decoupled Head with a Shared Lightweight Convolutional Detection (SLCD) Head, reducing the model’s computational complexity while increasing detection accuracy. Wang et al.²⁹ replaced the original detection head with a more efficient LADH detection head and removed the head processing 32×32 feature maps, effectively reducing the model complexity and significantly improving detection accuracy.

Although existing lightweight fabric defect detection methods have made notable progress in reducing model complexity and improving detection accuracy, most have optimized only a single module, and still suffer from large parameter sizes, high computational complexity, and considerable memory consumption, which limit their application in real industrial environments. To overcome these shortcomings, this paper proposes a systematic lightweight optimization strategy applied to the backbone, neck, and detection head, aiming to reduce computational complexity and memory usage while maintaining real-time detection performance, thereby improving adaptability to various production conditions.

3. Methods

This section presents the overall architecture of the proposed fabric defect detection model, GSS-YOLOv8. Then, it provides a detailed description of the GhostHGNetV2 feature extraction backbone, the Slim-Neck feature fusion network, and the shared detail-enhanced detection head.

The YOLO series algorithms are typical representatives of single-stage approaches in the field of object detection. YOLOv8 mainly consists of four components: the input module, backbone network, neck network, and detection head. The backbone is composed of convolutional layers, C2f modules, and the SPPF module. The C2f module is an optimized version of the C3 structure in YOLOv5, featuring double convolution and multi-branch connections to enhance gradient flow. Inspired by the ELAN module in YOLOv7, parallel branches are introduced to improve feature representation capability. The SPPF module expands the receptive field through multi-scale pooling. The neck adopts a PAN-FPN structure to achieve multi-scale feature fusion. The head employs a decoupled structure with an anchor-free mechanism and incorporates Distribution Focal Loss and Task Aligned Assigner to optimize training and improve detection performance. However, YOLOv8 introduces a complex feature extraction and fusion structure, leading to increased network depth, a large number of parameters, high computational complexity, and significant memory consumption, which impose certain limitations in practical industrial applications.

Therefore, this paper proposes the GSS-YOLOv8 model to achieve lightweight detection of fabric defects. Ghost convolution is introduced to optimize the HGBlock, resulting in the creation of Ghost_HGBlock. GhostHGNetV2 replaces the original backbone network and serves as the model’s feature extraction network. This enhances the model’s ability to independently learn cross-channel features and further capture local features of defect images, thereby improving its representation capability for fabric defect characteristics while effectively reducing the number of parameters in the backbone. Secondly, in the feature fusion stage, lightweight GSConv is employed to replace the original convolutional kernels, and the Slim-Neck feature fusion network is introduced. This enables each feature layer to simultaneously consider the semantic information of deep features and the fine-grained details of shallow features. The model is thus simplified and computational complexity reduced, while maintaining detection accuracy. Finally, a shared detail-enhanced detection head (SDEH) is designed. By sharing convolutional parameters, the convolutional computation of multi-scale feature maps during detection is reduced, which decreases the number of parameters in the detection head and further simplifies the overall network, making it more suitable for deployment. The architecture of the improved GSS-YOLOv8 model is shown in Figure 1.

Figure 1.

The structure diagram of the GSS-YOLOv8 algorithm.

3.1. GhostHGNetV2 feature extraction network

The original YOLOv8 backbone adopts a deep convolutional structure with many standard convolutions and C2f modules, resulting in relatively high computational cost and parameter redundancy. As the network depth increases, the computational burden becomes heavier, which limits its suitability for real-time inference and edge deployment. To address this issue, GhostHGNetV2 is introduced as the backbone of the proposed model. Based on the HGNet architecture in RT-DETR,³⁰ GhostHGNetV2 employs Ghost convolution³¹ to optimize the convolutional layers in HGBlock and constructs the proposed Ghost_HGBlock. In addition, standard convolutions and C2f modules in the original YOLOv8 backbone are replaced with Ghost_HGBlock and depthwise separable convolution, thereby reducing model complexity while preserving effective feature extraction capability. As shown in Figure 1, GhostHGNetV2 mainly consists of HGStem, Ghost_HGBlock, depthwise separable convolutions, and the SPPF module.

HGStem serves as the initial preprocessing layer of the network, performing preliminary feature extraction from the input image using convolution operations. It applies max pooling to downsample and reduce dimensionality, enabling the capture of multi-scale features. The workflow of the HGStem module is shown in Figure 2(a).

Figure 2.

Flow diagram of HGStem and HGBlock.

The specific process is expressed by the following equation:

Y_{1} = \mathrm{MaxPool}_{2 \times 2}^{1} [F_{3 \times 3}^{2} (X)]

(1)

Y_{2} = F_{2 \times 2}^{1} \{F_{2 \times 2}^{1} [F_{3 \times 3}^{2} (X)]\}

(2)

Y = F_{1 \times 1}^{1} \{F_{3 \times 3}^{2} [\mathrm{Concat} (Y_{1}, Y_{2})]\}

(3)

Here, X denotes the input image and Y the output feature map. F_{3 \times 3}^{2}, F_{2 \times 2}^{1}, and F_{1 \times 1}^{1} represent ordinary convolution operations whose kernel sizes are 3, 2, and 1 and whose strides are 2, 1, and 1, respectively. \mathrm{MaxPool}_{2 \times 2}^{1} is a max-pooling operation with a 2×2 window and stride 1. By fusing the pooled features with the convolutional feature maps, the HGStem preprocessing module enriches the features and enhances the model’s ability to express them effectively.
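As a rough illustration of Eqs. (1)–(3), the sketch below traces how the spatial size of a feature map changes through HGStem. The text specifies only kernel sizes and strides, so the padding values used here are assumptions chosen so that the two branches have matching sizes before concatenation; the helper names `out_size` and `hgstem_sizes` are hypothetical.

```python
def out_size(n, kernel, stride, pad_total):
    """Spatial size after a conv/pool layer.

    pad_total is the padding summed over both sides (an assumed value,
    not given in the text).
    """
    return (n + pad_total - kernel) // stride + 1


def hgstem_sizes(n):
    """Return (stem size, branch size, output size) for an n x n input."""
    stem = out_size(n, 3, 2, 2)                       # F_{3x3}^{2}(X)
    y1 = out_size(stem, 2, 1, 1)                      # max-pooling branch, Eq. (1)
    y2 = out_size(out_size(stem, 2, 1, 1), 2, 1, 1)   # two F_{2x2}^{1} convs, Eq. (2)
    assert y1 == y2, "Concat needs equal spatial sizes"
    fused = out_size(y1, 3, 2, 2)                     # F_{3x3}^{2} after Concat, Eq. (3)
    return stem, y1, out_size(fused, 1, 1, 0)         # final F_{1x1}^{1}


print(hgstem_sizes(640))  # (320, 320, 160)
```

Under these assumptions, the two stride-2 convolutions reduce a 640×640 input to 160×160, while the stride-1 branches leave the intermediate size unchanged.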

HGBlock is used for feature extraction, as shown in Figure 2(b). It employs convolutional layers of different sizes to capture multi-scale features, followed by channel compression and feature reweighting. Through this process, the module enhances important features and improves the representation of hierarchical and contextual information.

Ghost convolution uses inexpensive linear transformations to generate additional “Ghost” feature maps from intrinsic features, thereby enriching feature representation at low cost, as shown in Figure 3. Specifically, a standard 1×1 convolution is first applied to compress the channels and produce the intrinsic feature maps. Then, a series of simple linear operations Cn, implemented by depthwise separable convolutions, are used to generate additional Ghost feature maps. Finally, the intrinsic and Ghost feature maps are concatenated to form the output feature maps. In this way, Ghost convolution effectively reduces the computational cost of conventional convolution while accelerating model inference.

Figure 3.

The structure diagram of Ghost convolution.
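The saving offered by Ghost convolution can be estimated with a simple parameter count. The sketch below is illustrative only: the ratio of output channels to intrinsic channels (`ratio`) and the depthwise kernel size (`dw_kernel`) are assumed values that the text does not specify, biases are ignored, and the function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_conv_params(c_in, c_out, ratio=2, dw_kernel=3):
    """Weights of a Ghost convolution under the assumed ratio and kernel size."""
    intrinsic = c_out // ratio                 # channels produced by the 1x1 convolution
    primary = c_in * intrinsic                 # 1x1 convolution weights
    # cheap linear operations: one depthwise filter per generated Ghost map
    cheap = intrinsic * (ratio - 1) * dw_kernel * dw_kernel
    return primary + cheap


print(standard_conv_params(64, 128, 1))  # 8192
print(ghost_conv_params(64, 128))        # 4672
```

With these assumed settings, half of the output channels come from the 1×1 convolution and the other half from depthwise operations, which is where the reduction comes from.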

The process framework of Ghost_HGBlock is shown in Figure 4. The input features are fused using multiple Ghost convolutions followed by a Concat operation. Then, two standard 1×1 convolutions are applied to adjust the size of the output feature maps, extract local information, and improve the detection accuracy for tiny and sparse fabric defects. The specific implementation is described by the following formula:

\begin{array}{l} Y_{1} = G (X), \\ Y_{2} = G (Y_{1}), \\ \cdots \\ Y_{n} = G (Y_{n - 1}) \end{array}

(4)

Y = F_{1 \times 1}^{1} \{F_{1 \times 1}^{1} [\mathrm{Concat} (Y_{1}, Y_{2}, \cdots, Y_{n})]\}

(5)

In the formula, G represents the Ghost convolution operation described above.

Figure 4.

The structure diagram of Ghost_HGBlock.

Depthwise separable convolution is a combination of depthwise convolution and pointwise convolution, as illustrated in Figure 5. In pointwise convolution, each kernel performs a weighted combination of the feature maps produced by the depthwise convolution along the channel dimension, generating new feature maps. Each kernel corresponds to one output channel, enabling both cross-channel feature fusion and adjustment of the number of output channels.

Figure 5.

The structure diagram of the depthwise separable convolution.

The parameter counts of the standard convolution and the depthwise separable convolution are calculated by the following equations:

\mathrm{Conv} = c \times m \times k \times k

(6)

\mathrm{DWConv} = c \times k \times k

(7)

In the equations, c represents the number of input feature map channels, k \times k denotes the kernel size, and m is the number of output feature map channels. The parameter ratio between depthwise separable convolution and standard convolution is as follows:

\frac{\mathrm{DWConv}}{\mathrm{Conv}} = \frac{c \times k \times k}{c \times m \times k \times k} = \frac{1}{m}

(8)

As can be seen from Eq. (8), the depthwise separable convolution requires only 1/m of the parameters of the standard convolution. It therefore substantially reduces the model parameters and computational complexity while maintaining high accuracy.
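The ratio in Eq. (8) can be checked numerically. Note that Eq. (7) counts only the depthwise stage; the sketch below additionally counts the c × m weights of the pointwise 1×1 stage, which is an extension not stated in the equations above and gives a ratio of 1/m + 1/k². The channel numbers are arbitrary illustrative values and the function names are hypothetical.

```python
def conv_params(c, m, k):
    """Eq. (6): standard convolution."""
    return c * m * k * k


def depthwise_params(c, k):
    """Eq. (7): depthwise stage only."""
    return c * k * k


def pointwise_params(c, m):
    """1x1 pointwise stage (not part of Eq. (7); added here for completeness)."""
    return c * m


c, m, k = 64, 128, 3  # illustrative values, not taken from the paper
standard = conv_params(c, m, k)
dw = depthwise_params(c, k)
full = dw + pointwise_params(c, m)

print(dw / standard)    # 0.0078125, i.e. 1/m as in Eq. (8)
print(full / standard)  # about 0.119, i.e. 1/m + 1/k**2
```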

3.2. Slim-Neck feature fusion network

YOLOv8 uses many standard convolutions in its neck network for feature fusion, which increases inference time and computational cost. To address this issue, the Slim-Neck design is introduced into the neck of YOLOv8. Slim-Neck is a lightweight feature fusion network mainly composed of GSConv, GS Bottleneck, and VoVGSCSP, which can be flexibly combined to construct an efficient neck architecture. In this design, GSConv is used to replace standard convolutions. As shown in Figure 6, GSConv combines standard convolution, depthwise separable convolution, and channel shuffle operations. It concatenates features generated by standard and depthwise separable convolutions, and then uses channel shuffle to enhance cross-channel information interaction, enabling the output of depthwise separable convolution to better approximate that of standard convolution. The computational cost of GSConv is about 50% of that of standard convolution, while maintaining comparable feature learning ability.

Figure 6.

The structure diagram of GSConv. The blue “Conv” denotes standard convolution; the pink “DWConv” denotes depthwise separable convolution.
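The concatenate-then-shuffle step of GSConv and its approximate cost can be sketched as follows. This is a simplified illustration under stated assumptions: the shuffle is a generic two-group interleave (the exact shuffle pattern is not given in the text), the cheap branch is counted as a depthwise filter with an assumed 5×5 kernel (`dw_k`), and all function names are hypothetical.

```python
def channel_shuffle(channels, groups=2):
    """Interleave channels across groups (a generic, ShuffleNet-style shuffle)."""
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group) for g in range(groups)]


def gsconv_channel_layout(c_out):
    """Channel order after concatenation and shuffle."""
    half = c_out // 2
    dense = [f"conv{i}" for i in range(half)]   # standard-convolution branch
    cheap = [f"dw{i}" for i in range(half)]     # depthwise branch applied to `dense`
    return channel_shuffle(dense + cheap)


def gsconv_cost_ratio(c_in, c_out, k=3, dw_k=5):
    """Parameter ratio of GSConv to a standard k x k convolution (bias ignored)."""
    standard = c_in * c_out * k * k
    gs = c_in * (c_out // 2) * k * k + (c_out // 2) * dw_k * dw_k
    return gs / standard


print(gsconv_channel_layout(4))               # ['conv0', 'dw0', 'conv1', 'dw1']
print(round(gsconv_cost_ratio(64, 128), 3))   # 0.522
```

In this sketch, the standard convolution produces only half of the output channels, which is why the ratio stays close to the roughly 50% figure quoted in the text.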

Secondly, based on GSConv, Slim-Neck incorporates the GS Bottleneck module to further reduce the model’s computational load, as illustrated in Figure 7(a). In addition, a one-time aggregation strategy is adopted to design an efficient cross-stage partial (CSP) module, VoVGSCSP, which aims to reduce computational complexity and inference time while maintaining accuracy. The structure of VoVGSCSP is shown in Figure 7(b).

Figure 7.

The structure diagram of GS bottleneck and VoVGSCSP.

The architecture of the complete Slim-Neck feature fusion network is shown in Figure 8. In this structure, the VoVGSCSP module and GSConv module replace the C2f and standard convolution in the original YOLOv8 neck, respectively, while retaining the original FPN-PAN architecture unchanged.

Figure 8.

Comparison of original YOLOv8 neck structure and Slim-Neck structure.

3.3. SDEH

The original detection head of YOLOv8 performs detection on feature maps of three different scales (P3, P4, and P5), each of which requires an independent detection head that performs separate convolutional computations to adjust channels and extract features. As a result, the detection head includes multiple standard convolutional layers, leading to an increase in the number of model parameters and high computational overhead. To address these issues, this paper proposes a shared detail-enhanced detection head (SDEH), which aims to reduce the amount of convolutional computation across different-scale feature maps during detection by sharing convolutional parameters. The SDEH consists of the following three main strategies.

The Batch Normalization (BN) in the convolution layers is replaced by Group Normalization (GN)³² to avoid the extra memory that BN consumes for storing the mean and variance. The error of BN depends on the batch size, which is in turn limited by memory consumption, whereas GN does not need to store additional parameters and maintains stable performance when inference is run on small batches of data. This reduces computational overhead and makes GN more suitable for deployment on lightweight devices. In addition, because fabric defects are tiny and highly similar to the background, silk fabric images tend to have high resolution. GN avoids the feature distortion of BN normalization at high resolution, is more robust for high-resolution detection, and enhances the model’s adaptability to multi-scale features. A comparison of the BN and GN normalization methods is shown in Figure 9.

Figure 9.

Normalization methods for BN and GN.

GN divides the channels into groups and normalizes the features within each group. The GN feature normalization is computed as follows:

\hat{x}_{i} = \frac{1}{\sigma_{i}} (x_{i} - \mu_{i})

(9)

In the equation, x represents the feature computed by the layer, i is the index, and \mu and \sigma denote the mean and standard deviation calculated by the following formulas.

\mu_{i} = \frac{1}{m} \sum_{k \in S_{i}} x_{k}

(10)

\sigma_{i} = \sqrt{\frac{1}{m} \sum_{k \in S_{i}} (x_{k} - \mu_{i})^{2} + \epsilon}

(11)

Here, S_{i} is the set of pixels over which the mean and standard deviation are computed, m is the size of this set, and \epsilon is a small constant for numerical stability. In GN, the set S_{i} is defined as:

S_{i} = \{k \mid k_{N} = i_{N}, \lfloor \frac{k_{C}}{C / G} \rfloor = \lfloor \frac{i_{C}}{C / G} \rfloor\}

(12)

In the equation, the subscripts N and C denote the batch and channel indices of a pixel, G represents the number of groups, C / G denotes the number of channels per group, and \lfloor \cdot \rfloor indicates the floor operation. The pixels in each group of C / G channels are normalized together using that group’s mean and standard deviation.
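Eqs. (9)–(12) can be illustrated with a minimal pure-Python sketch for a single sample. The data layout (a list of channels, each a flat list of pixel values), the value of `eps`, and the function name are assumptions made for illustration; the learnable scale and shift that usually follow the normalization are omitted.

```python
import math


def group_norm(x, num_groups, eps=1e-5):
    """Normalize one sample; x is a list of C channels of flat pixel lists."""
    channels_per_group = len(x) // num_groups            # C / G
    out = []
    for g in range(num_groups):
        group = x[g * channels_per_group:(g + 1) * channels_per_group]
        values = [v for ch in group for v in ch]         # the set S_i, Eq. (12)
        m = len(values)
        mu = sum(values) / m                             # Eq. (10)
        sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / m + eps)  # Eq. (11)
        out.extend([[(v - mu) / sigma for v in ch] for ch in group])     # Eq. (9)
    return out


# four channels split into two groups with very different value ranges
sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
normalized = group_norm(sample, num_groups=2)
```

Because the statistics are computed per sample and per group, the result does not depend on the batch size, which is the property the text relies on for small-batch inference.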

Detail-Enhanced Convolution (DEConv)³³ is introduced to construct a shared convolutional layer, which integrates the semantic and contextual information contained in the feature maps captured by the three detection heads at different scales, thereby generating prediction boxes and classification targets. By sharing the weight parameters of the convolutional layers, the model’s efficiency and accuracy are improved, while memory consumption and redundant computational overhead are reduced. Detail-Enhanced Convolution (DEConv) integrates prior information into standard convolutional layers. It combines the learned features from five parallel convolutional branches to generate the final output, thereby enhancing the model’s representational and generalization capabilities. Its structural diagram is shown in Figure 10. By leveraging re-parameterization techniques, DEConv is able to extract richer features without increasing the number of parameters, and without adding additional computational or memory burden during the inference stage. The specific formula for the re-parameterized reconstruction of the layer is as follows:

F_{out} = \mathrm{DEConv} (F_{in}) = \sum_{i = 1}^{5} F_{in} * K_{i} = F_{in} * (\sum_{i = 1}^{5} K_{i}) = F_{in} * K_{cvt}

(13)

where F_{in} represents the input features; F_{out} indicates the output features; K_{i} denotes the kernels of VC, CDC, ADC, HDC, and VDC, respectively; * represents the convolution operation; and K_{cvt} refers to the re-parameterized kernel that combines the parallel convolutions. The inference time and computational cost of the Detail-Enhanced Convolution are therefore the same as those of a regular convolution.

Figure 10.

The structure diagram of the detail-enhanced convolution: VC, CDC, ADC, VDC, and HDC are the five parallel convolutional layers in the detail-enhanced convolution.
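The re-parameterization in Eq. (13) rests on the linearity of convolution: summing the outputs of parallel convolutions equals a single convolution with the summed kernel. The sketch below checks this on small integer data. The five kernels are arbitrary stand-ins rather than the actual VC, CDC, ADC, HDC, and VDC kernels, and the function names are hypothetical.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a square kernel."""
    k = len(kernel)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(k) for j in range(k))
             for c in range(cols)] for r in range(rows)]


def add_maps(a, b):
    """Element-wise sum of two feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def merge_kernels(kernels):
    """K_cvt: element-wise sum of the parallel kernels."""
    k = len(kernels[0])
    return [[sum(ker[i][j] for ker in kernels) for j in range(k)] for i in range(k)]


image = [[(r * 5 + c) % 7 for c in range(5)] for r in range(5)]
kernels = [[[(n + i * 3 + j) % 4 - 1 for j in range(3)] for i in range(3)]
           for n in range(5)]

# training-time view: five parallel branches whose outputs are summed
parallel = conv2d(image, kernels[0])
for ker in kernels[1:]:
    parallel = add_maps(parallel, conv2d(image, ker))

# inference-time view: one convolution with the merged kernel
merged = conv2d(image, merge_kernels(kernels))
```

Here `parallel` and `merged` are identical, so only one kernel needs to be stored and applied at inference time.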

When using a shared convolutional layer, the target scales detected by each detection head may be inconsistent. To address this issue, a Scale layer is introduced to rescale the features. By adjusting the scale of the features, it helps improve the stability of model training. The structure of the shared detail-enhanced detection head (SDEH) is illustrated in Figure 11.

Figure 11.

The structure diagram of the SDEH.

The SDEH module receives three feature maps of different scales (P3, P4, and P5) from the neck network. To integrate the feature information effectively, three 1×1 convolutional layers with Group Normalization are applied respectively to adjust the number of channels for each feature map. Then, two weight-shared 3×3 Detail-Enhanced Convolutions are used to enhance the model’s representational capacity, aggregating rich contextual and multi-scale information while reducing the overall number of parameters. Finally, a 1×1 standard convolution is employed to decouple the computation of classification and regression losses. In addition, a Scale layer is added after each regression branch to dynamically adjust the target scale, thereby addressing the discrepancies in target sizes handled by different detection heads.
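The effect of weight sharing on the head’s parameter count can be estimated roughly as follows. The sketch compares three independent pairs of 3×3 convolutions with the SDEH layout of per-scale 1×1 convolutions followed by two shared 3×3 convolutions. The channel widths are hypothetical, and normalization layers, the final 1×1 prediction convolutions, and the Scale layers are ignored, so the numbers are indicative only.

```python
def conv_params(c_in, c_out, k):
    """Weights of a k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def independent_head_params(in_channels, hidden):
    """Each scale owns its own pair of 3x3 convolutions."""
    return sum(conv_params(c, hidden, 3) + conv_params(hidden, hidden, 3)
               for c in in_channels)


def shared_head_params(in_channels, hidden):
    """Per-scale 1x1 channel alignment, then two 3x3 convolutions shared by all scales."""
    align = sum(conv_params(c, hidden, 1) for c in in_channels)
    shared = 2 * conv_params(hidden, hidden, 3)
    return align + shared


scales = (64, 128, 256)  # assumed P3, P4, P5 channel widths
print(independent_head_params(scales, 64))  # 368640
print(shared_head_params(scales, 64))       # 102400
```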

4. Experimental results and analysis

4.1. Dataset

The experimental images for this dataset were collected from Guangde Xinfeng Silk Co. Ltd. and the Shengzhou Innovation Research Institute of Zhejiang Sci-Tech University, with defect images captured using high-resolution mobile devices. The silk fabrics primarily consist of Songjin, a brocade-like fabric made with pure mulberry silk or mulberry silk warp. The dataset contains nine types of silk fabric defects: Broken thread, Grease stain, Cracked tangle, Tight warp, Bar, Thick streak, Filament turndown, Sloughed-off weft and Missing end, comprising 677 high-resolution images (3000×3000 pixels), as shown in Figure 12.

Figure 12.

Image of defects in silk fabric.

Table 1 presents the class distribution of the original self-built silk fabric defect dataset. It can be observed that the numbers of samples vary across different defect categories, indicating a certain degree of class imbalance in the original dataset. Meanwhile, some defect categories contain relatively few samples, which is not conducive to sufficiently learning the feature representations of different defects. Therefore, to enlarge the dataset scale, alleviate sample insufficiency, and improve data diversity as well as model generalization, offline data augmentation was further applied to the original dataset. Specifically, six augmentation methods were adopted, including horizontal flipping, contrast adjustment, Gaussian blurring, random cropping, luminance enhancement, and affine transformation. After augmentation, the total number of images was expanded to 2,708, and the dataset was divided into training, validation, and test sets at a ratio of 80%, 10%, and 10%.

Table 1.

Class distribution of the self-built silk fabric defect dataset.

Category	DuanSi	YouWu	Cao	JiJing	Dang	Cu	MaoSi	Yu	QueJing
Number	53	28	243	43	85	64	86	55	20
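The figures quoted for the dataset can be cross-checked with a few lines of arithmetic. The class counts are taken from Table 1; how the 80%/10%/10% split is rounded is not stated in the text, so the rounding in `split_sizes` is an assumption, and the function name is hypothetical.

```python
class_counts = {
    "DuanSi": 53, "YouWu": 28, "Cao": 243, "JiJing": 43, "Dang": 85,
    "Cu": 64, "MaoSi": 86, "Yu": 55, "QueJing": 20,
}

total = sum(class_counts.values())
imbalance = max(class_counts.values()) / min(class_counts.values())
augmented_total = 2708  # total reported after offline augmentation


def split_sizes(n, train_frac=0.8, val_frac=0.1):
    """Train/validation/test sizes; rounding to the nearest image is assumed."""
    train = round(n * train_frac)
    val = round(n * val_frac)
    return train, val, n - train - val


print(total)                        # 677
print(imbalance)                    # 12.15
print(augmented_total / total)      # 4.0
print(split_sizes(augmented_total)) # (2166, 271, 271)
```

The counts in Table 1 sum to the 677 original images, the largest class is about twelve times the smallest, and the augmented set is exactly four times the size of the original.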

4.2. Experimental details

The processor is an Intel(R) Core(TM) i9-14900KF CPU; the RAM is 8 GB; the graphics card is a Gigabyte RTX 4060 with 8 GB of video memory; the hard disk is a 1 TB SSD; the operating system is Windows 10 (64-bit); the Compute Unified Device Architecture (CUDA) version is 11.8; and the deep learning framework is PyTorch 2.4.1 with Python 3.8.19. The experimental parameters are shown in Table 2.

Table 2.

Experimental parameters.

Parameters	Value
Optimizer	SGD
Batch size	16
Epochs	500
Learning rate	0.01
Momentum	0.937
patience	25
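As an illustration, the settings in Table 2 could be passed to a training run as sketched below. This assumes the Ultralytics YOLO Python API, which the paper does not explicitly name; the model and dataset file names are hypothetical placeholders, and the mapping of "Learning rate" to the initial learning rate `lr0` is an assumption.

```python
# Hyperparameters from Table 2, expressed with Ultralytics argument names
# (assumption: the paper does not state which training API was used).
TRAIN_ARGS = {
    "epochs": 500,
    "batch": 16,
    "optimizer": "SGD",
    "lr0": 0.01,        # initial learning rate (assumed meaning of "Learning rate")
    "momentum": 0.937,
    "patience": 25,     # early-stopping patience, in epochs
}

def train(model_cfg="yolov8n.yaml", data_cfg="silk_defects.yaml"):
    """Launch a training run; both file names are hypothetical placeholders."""
    # Imported lazily so the configuration can be inspected without the package.
    from ultralytics import YOLO
    model = YOLO(model_cfg)
    return model.train(data=data_cfg, **TRAIN_ARGS)
```

Calling `train()` requires the `ultralytics` package and a dataset description file; the dictionary alone documents the configuration.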

4.3. Evaluation indicators

In this paper, model performance is evaluated using precision (P), recall (R), mean Average Precision at an IoU threshold of 0.5 (mAP@0.5), floating-point operations (GFLOPs), and frames per second (FPS). The formulas for precision and recall are as follows:

P = \frac{TP}{TP + FP}

(14)

R = \frac{TP}{TP + FN}

(15)

where TP denotes the number of correctly detected targets, FP represents the number of falsely detected targets, and FN refers to the number of missed detections. The average precision (AP) and mean average precision (mAP) are calculated as follows:

AP = \int_{0}^{1} P(R) \, dR, \quad mAP = \frac{\sum_{i=1}^{N} AP_{i}}{N}

(16)

mAP@0.5 = \frac{1}{C} \sum_{i=1}^{C} AP@0.5_{i}

(17)

where N denotes the number of detection categories, AP_{i} represents the average precision of the i-th category, and C is the total number of defect categories. mAP@0.5 refers to the mean Average Precision at an IoU threshold of 0.5. The calculation formula for GFLOPs is as follows:

GFLOPs = \frac{\text{Total floating-point operations}}{\text{Computation time (seconds)} \times 10^{9}}

(18)

The calculation formula for FPS is as follows:

FPS = \frac{1000}{T_{pre} + T_{infer} + T_{NMS}}

(19)

where T_{pre} is the time consumed by image pre-processing, T_{infer} is the inference time, and T_{NMS} is the time taken for post-processing (non-maximum suppression), all in milliseconds. A higher FPS reflects greater inference efficiency.
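The metrics defined in this section can be sketched in a few lines of Python. This is a simplified illustration, not the evaluation code used in the paper: AP is approximated with the trapezoidal rule over a handful of precision-recall points, whereas detection toolkits typically use an interpolated precision envelope, and all numbers below are toy values.

```python
def precision(tp, fp):
    """Eq. (14): fraction of detections that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp, fn):
    """Eq. (15): fraction of ground-truth targets that are detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def average_precision(recalls, precisions):
    """Approximate AP, the integral of P(R) over R, with the trapezoidal rule.
    The points must be sorted by increasing recall."""
    ap = 0.0
    for i in range(1, len(recalls)):
        width = recalls[i] - recalls[i - 1]
        ap += width * (precisions[i] + precisions[i - 1]) / 2.0
    return ap

def mean_average_precision(ap_per_class):
    """Average the per-class AP values over all categories."""
    return sum(ap_per_class) / len(ap_per_class)

def fps(t_pre_ms, t_infer_ms, t_nms_ms):
    """Eq. (19): frames per second from per-image stage times in milliseconds."""
    return 1000.0 / (t_pre_ms + t_infer_ms + t_nms_ms)

# Toy values, not results from the paper.
p = precision(tp=80, fp=20)
r = recall(tp=80, fn=20)
ap = average_precision([0.0, 0.5, 1.0], [1.0, 0.8, 0.6])
m_ap = mean_average_precision([ap, 0.6])
speed = fps(0.5, 3.0, 0.5)
```

Computing mAP@0.5 amounts to building each class's precision-recall curve with detections counted as correct when their IoU with a ground-truth box is at least 0.5, and then averaging the resulting AP values.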

4.4. Ablation experiment

To verify the effectiveness of the three improvements based on YOLOv8n and evaluate the rationality of the overall network architecture, we introduced GhostHGNetV2, Slim-Neck, and SDEH into the YOLOv8n framework in the form of single-module, dual-module, and triple-module combinations, as shown in Table 3. The performance of each upgraded network configuration was subsequently tested.

Table 3.

Results of ablation experiments. A: GhostHGNetV2 backbone; B: Slim-Neck feature fusion network; C: SDEH.

Methods	Parameters	GFLOPs (G)	Model size (MB)	P (%)	R (%)	mAP@0.5 (%)
YOLOv8n	3,007,403	8.1	6.3	80.8	79.9	84.4
YOLOv8n+A	2,310,463	6.8	5.0	86.0	75.8	84.4
YOLOv8n+B	2,797,419	7.3	6.0	83.3	83.2	86.6
YOLOv8n+C	2,363,036	6.5	5.4	80.6	74.3	82.0
YOLOv8n+A+B	2,100,479	6.0	4.6	88.1	81.3	86.7
YOLOv8n+A+C	1,666,096	5.3	4.0	74.3	76.7	79.2
YOLOv8n+B+C	2,153,052	5.7	5.0	82.7	79.6	84.9
YOLOv8n+A+B+C	1,456,112	4.4	3.7	85.9	80.0	86.5

Bold values indicate the best results for the corresponding metrics.

As shown in the ablation results in Table 3 and the dual Y-axis bar-line chart of parameter count and GFLOPs in Figure 13, replacing the original YOLOv8n backbone with A (GhostHGNetV2) reduces the number of parameters by 23% and the computational cost by 16%. This improvement is attributed to the design of GhostHGNetV2, which integrates Ghost convolution with the lightweight HGNet architecture. Specifically, the use of depthwise separable convolutions avoids cross-channel computations, while the Ghost_HGBlock module reduces the number of channels, enabling feature extraction with fewer parameters and effectively minimizing computational redundancy. In addition, GhostHGNetV2 leverages the linear operations of the Ghost module to capture richer feature representations. Despite the reduction in model complexity, the improved network maintains an mAP@0.5 of 84.4%, and precision rises by 5.2 percentage points to 86.0%, indicating that GhostHGNetV2 achieves strong detection performance while significantly improving computational efficiency. However, since the optimization primarily targets the backbone, the model is less able to identify certain complex and subtle silk fabric defects, and recall drops by 4.1 percentage points to 75.8%.

Figure 13.

Parameter counts and GFLOPs of the different module combinations.
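To make the parameter saving of Ghost convolution concrete, the following sketch compares the weight count of a standard convolution with that of a generic GhostNet-style module, in which a primary convolution produces a fraction of the output maps and cheap depthwise operations generate the rest. The channel widths, the ratio of 2, and the 3×3 cheap kernel are illustrative assumptions and do not reproduce the exact GhostHGNetV2 configuration.

```python
def conv_params(c_in, c_out, k):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_conv_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """GhostNet-style module: a primary conv produces c_out / ratio intrinsic
    maps, and depthwise 'cheap' convs generate the remaining maps from them."""
    intrinsic = c_out // ratio
    primary = conv_params(c_in, intrinsic, k)
    # Depthwise: one cheap_k x cheap_k filter per generated map.
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k
    return primary + cheap

# Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
standard = conv_params(128, 256, 3)      # 294912 weights
ghost = ghost_conv_params(128, 256, 3)   # 147456 + 1152 = 148608 weights
saving = 1 - ghost / standard            # close to one half for ratio = 2
```

With a ratio of 2 the module needs roughly half the weights of the standard layer, because the depthwise part adds only a small number of parameters.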

By introducing the B (Slim-Neck paradigm) design after the backbone network, feature fusion is optimized, enabling high-level features to be transmitted more effectively to the detection head while simultaneously reducing computational complexity. Compared with the baseline model, this improvement raises mAP@0.5 by 2.2 percentage points to 86.6% and recall by 3.3 percentage points to 83.2%, enhancing the model’s ability to detect silk fabric defects.

The proposed C (SDEH) is applied to optimize the detection head, reducing the number of parameters by 21.3% through parameter sharing in convolutional layers and decreasing the computational cost by 19.8%. Meanwhile, the use of DEConv in the detection head enhances the model’s ability to capture fine-grained defect details. However, detection accuracy declines when C is used alone (mAP@0.5 falls from 84.4% to 82.0% and recall from 79.9% to 74.3%), which may be attributed to the limited adaptability of the detection head to defects with relatively large annotation regions, such as Grease stain and Cracked tangle, that are present in the silk fabric dataset.
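The effect of parameter sharing in the detection head can be illustrated with a rough count. The sketch below assumes three detection scales (as in the default YOLOv8 head), two convolutions per branch, and a hypothetical common channel width of 64; each detail-enhanced convolution is counted as a plain 3×3 convolution, which is a simplification and not the exact SDEH layout.

```python
def conv_params(c_in, c_out, k=3):
    """Weights of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def head_params(width, num_scales=3, convs_per_branch=2, shared=False):
    """Parameters of the conv stack in a detection head.
    Unshared: every scale owns its own stack. Shared: one stack serves all scales."""
    stack = convs_per_branch * conv_params(width, width)
    return stack if shared else num_scales * stack

separate = head_params(64)               # 3 * 2 * (64 * 64 * 9) = 221184
shared = head_params(64, shared=True)    # 2 * (64 * 64 * 9) = 73728
saving = 1 - shared / separate           # two thirds of the stack's weights
```

The same sharing that removes these weights also means that all scales are processed by identical filters, which is consistent with the reduced scale adaptability discussed in the text.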

The A+B combination achieves the highest precision (88.1%) and mAP@0.5 (86.7%) in Table 3, together with a recall of 81.3%, indicating strong complementarity between the two modules. Module A mainly operates at the backbone feature extraction stage, improving cross-channel feature modeling efficiency and local defect representation capability. However, its lightweight compression may also weaken part of the fine-grained information, which reduces recall when A is used alone. Module B mainly operates at the neck feature fusion stage and can more effectively fuse shallow texture details with deep semantic information, thus compensating for the detail loss caused by feature compression in A. For silk fabric defects, which are typically small in scale, weak in texture, and highly similar to the background, module A provides more efficient feature extraction, while module B enhances cross-level feature transmission and fusion. The combination therefore improves detection performance while remaining lightweight, demonstrating a strong synergistic effect.

The A+C combination degrades detection performance, suggesting that these two modules do not form effective synergy under the current task setting. Module A mainly operates at the backbone feature extraction stage. Although it improves feature extraction efficiency and reduces model complexity, its lightweight compression may also weaken part of the fine-grained information. Module C mainly operates at the detection head stage. By sharing convolutional parameters, it reduces redundant computation, but it may also limit the independent adaptability of the different-scale detection branches. For silk fabric defects, which are typically small in scale, weak in texture, and highly similar to the background, the combination of A and C is more likely to cause insufficient detail representation and inadequate scale adaptation when no feature-fusion compensation from module B is provided. As a result, precision, recall, and mAP@0.5 all fall below the baseline, to 74.3%, 76.7%, and 79.2%, respectively.

The detection performance of the B+C combination is close to that of the baseline (precision 82.7%, recall 79.6%, and mAP@0.5 84.9%, versus 80.8%, 79.9%, and 84.4%). This is because module B improves feature representation by enhancing the fusion of shallow details and deep semantic information, while module C, although reducing redundant computation in the detection head, also constrains the independent adaptability of the different-scale detection branches. As a result, the performance gain brought by B partially compensates for the limitation introduced by C, allowing this combination to maintain detection performance close to the baseline while remaining lightweight.

Experimental results show that the GSS-YOLOv8 model with the A+B+C configuration achieves strong overall performance while reducing computational complexity, parameter count, and model size. Compared with the baseline YOLOv8n, GSS-YOLOv8 improves precision by 5.1 percentage points to 85.9% while maintaining a recall of 80.0%, and mAP@0.5 increases from 84.4% to 86.5%, a gain of 2.1 percentage points. In addition, the number of parameters is reduced by 51.3% to 1,456,112, the computational cost decreases by 45.7% to 4.4 GFLOPs, and the model size is reduced from 6.3 MB to 3.7 MB. Although the A+B combination achieves slightly higher detection accuracy, the A+B+C configuration further reduces parameter count, GFLOPs, and model size while maintaining competitive accuracy. Therefore, from an overall perspective, GSS-YOLOv8 achieves a better balance between detection performance and lightweight deployment requirements, making it suitable for industrial silk fabric defect detection applications.
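The two kinds of change quoted in the ablation discussion are computed differently: complexity figures are relative reductions, while accuracy figures are differences in percentage points. The short sketch below reproduces both from Table 3 values; the helper names are hypothetical.

```python
def relative_reduction(baseline, new):
    """Relative reduction in percent, e.g. for GFLOPs or model size."""
    return (baseline - new) / baseline * 100.0

def point_change(baseline_pct, new_pct):
    """Absolute change between two percentages, in percentage points."""
    return new_pct - baseline_pct

# Values taken from Table 3 (YOLOv8n versus YOLOv8n+A+B+C).
gflops_cut = relative_reduction(8.1, 4.4)
precision_gain = point_change(80.8, 85.9)
map_gain = point_change(84.4, 86.5)

print(f"GFLOPs reduction: {gflops_cut:.1f}%")
print(f"Precision: {precision_gain:+.1f} points, mAP@0.5: {map_gain:+.1f} points")
```

Keeping the two notions separate avoids ambiguity: a gain of 5.1 percentage points on a baseline of 80.8% corresponds to a relative improvement of roughly 6.3%.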

4.5. Comparison with YOLOv8 performance

To evaluate the effectiveness of the lightweighting strategy adopted in GSS-YOLOv8, we conduct a visual comparison of the number of parameters and computational cost between each module of GSS-YOLOv8 and the original YOLOv8 model. As shown in Figure 14, the comparison covers the backbone, neck, detection head, and the overall network architecture, clearly illustrating the reduction in both parameter count and computational complexity achieved by the proposed lightweight design.

Figure 14.

Comparison of parameter counts and GFLOPs of each model component.

From the perspective of parameter quantity, GSS-YOLOv8 has significantly fewer parameters than YOLOv8n in every part of the network, indicating the effectiveness of the proposed lightweighting strategy in reducing model complexity. Specifically, replacing the original backbone with GhostHGNetV2 for feature extraction reduces the parameter count in this component by approximately 56%. In the neck structure, substituting the original C2f and standard convolution with the VoVGSCSP and GSConv modules further reduces parameters by about 42%. Additionally, the proposed shared detail-enhanced detection head (SDEH) reduces the parameter count of the detection head by approximately 59%. Overall, the total number of parameters is reduced from 3.26 million in YOLOv8n to 1.66 million in GSS-YOLOv8, a reduction of roughly 49%. This substantial decrease in parameter count lowers the model’s storage requirements, contributing to a more lightweight architecture.

From the perspective of computational cost, GSS-YOLOv8 exhibits lower complexity than YOLOv8n in every module, further validating the effectiveness of the proposed lightweighting strategy. Specifically, the computational cost of the backbone and neck modules is reduced by approximately 42% and 36.5%, respectively, significantly decreasing overall resource consumption. Notably, the SDEH reduces the computation of the detection head by more than 90%, with GFLOPs dropping from 2.99 to 0.17. Overall, the total computational cost of GSS-YOLOv8 is reduced by around 60% compared to YOLOv8n, making the model more suitable for low-power devices, edge computing, and embedded applications.

In summary, GSS-YOLOv8 compresses both parameters and computational cost by optimizing the backbone, neck, and detection head. In this component-wise comparison with YOLOv8n, it reduces the number of parameters by about 50% and the computational cost by about 60%, enhancing the model’s lightweight characteristics and improving its applicability in resource-constrained environments.

Figure 15 compares the defect detection results of the improved algorithm and the original YOLOv8 on the silk fabric test set. It presents eight groups of images (a–h) that together cover all nine types of silk fabric defects; each image contains one or more defect types. As illustrated, the improved algorithm detects defects more accurately than the original YOLOv8. For certain defects with extreme aspect ratios, which are often characterized by loose structures or a high degree of similarity to the background, the prediction boxes generated by YOLOv8 fail to accurately enclose the target regions. In contrast, the proposed GSS-YOLOv8 model generates prediction boxes that align more closely with the ground truth annotations. For instance, in group (a), the “Duan Si” defect is more completely covered by the prediction box of GSS-YOLOv8, indicating that the proposed enhancements improve the model’s localization capability and recall.

Furthermore, the original YOLOv8 is more prone to false positives when detecting small-target defects. In group (e), for example, fabric textures are erroneously detected as “You Wu” defects. These false detections are significantly reduced in the results generated by GSS-YOLOv8, demonstrating its stronger ability to distinguish true defects from normal textures. In addition, the confidence scores predicted by GSS-YOLOv8 are generally higher and more stable. For example, in group (g), the confidence score for the “Que Jing” defect increases from 0.7 (YOLOv8) to 0.9 (GSS-YOLOv8), indicating more reliable defect classification and stronger robustness in category recognition.

Figure 15.

Comparison image of defect detection results.

4.6. Comparison experiment

To verify the validity of the model, the proposed model is compared with mainstream lightweight models on the silk fabric dataset, including SSD, YOLOv3-tiny, YOLOv5n, YOLOv7-tiny, YOLOv8n, YOLOv9-tiny, YOLOv10n and YOLOv11n. As shown in Table 4, GSS-YOLOv8 achieves the best result on five evaluation metrics: number of parameters, GFLOPs, model size, recall, and mAP@0.5, demonstrating a favorable balance between computational complexity and detection accuracy. In terms of model lightweighting, the substantial reduction in both parameter count and computational cost enhances the model’s practicality for resource-constrained environments such as edge devices, mobile inference, and low-power applications. Regarding detection accuracy, the proposed model achieves a precision of 85.9%, a recall of 80.0%, and an mAP@0.5 of 86.5%, indicating that it maintains excellent object detection performance while being lightweight.

Table 4.

Comparative experimental results.

Methods	Parameters	GFLOPs (G)	Model size (MB)	FPS	P (%)	R (%)	mAP@0.5 (%)
YOLOv3-tiny	12,132,290	18.9	24.4	632.1	52.0	60.5	57.8
YOLOv5n	2,504,699	7.1	5.3	364.6	69.1	66.1	73.0
YOLOv7-tiny	6,029,244	13.1	12.3	95.2	78.3	77.6	80.1
YOLOv8n	3,007,403	8.1	6.3	374.7	80.8	79.9	84.4
YOLOv9-tiny	1,972,539	7.6	4.7	154.2	77.0	71.9	76.6
YOLOv10n	2,266,923	6.5	6.3	300.4	64.2	60.0	68.7
YOLOv11n	2,583,907	6.3	5.5	293.6	71.4	71.4	75.7
SSD	24,148,729	31.4	94.6	198.4	87.8	33.2	71.3
proposed	1,456,112	4.4	3.7	236.2	85.9	80.0	86.5

Bold values indicate the best results for the corresponding metrics.

Although YOLOv8n demonstrates strong detection performance among mainstream lightweight models, its parameter count and computational complexity remain relatively high. YOLOv11n, the latest algorithm proposed by the authors of YOLOv8, reduces the number of parameters and computational cost by 14.1% and 22.2%, respectively, compared to YOLOv8n, achieving a balance between lightweight design and detection accuracy. However, in terms of overall performance, GSS-YOLOv8 outperforms both by adopting a more efficient lightweight architecture: it significantly reduces model parameters and computational complexity while maintaining high detection accuracy.

YOLOv3-tiny achieves the highest inference speed with 632.1 FPS. However, its accuracy in silk fabric defect detection is relatively poor. YOLOv8n combines a high detection speed (374.7 FPS) with higher detection accuracy. The proposed GSS-YOLOv8 model reaches 236.2 FPS on the silk fabric dataset. Although GSS-YOLOv8 achieves clear reductions in parameter count and GFLOPs, its inference speed is lower than that of the original YOLOv8n. This indicates that reductions in theoretical model complexity do not necessarily translate into higher practical inference speed. A possible reason is that the proposed network introduces several lightweight but less hardware-friendly operations, including Ghost-based feature generation, channel shuffle, multi-branch feature processing, and frequent feature concatenation. While these operations are effective in reducing parameters and arithmetic cost, they may also increase memory access, tensor rearrangement, and runtime scheduling overhead on the GPU. As a result, part of the theoretical computational advantage may be offset during actual execution, leading to a lower FPS. Therefore, the lightweight advantage of GSS-YOLOv8 is mainly reflected in parameter count, GFLOPs, and model size, whereas its inference speed still leaves room for further improvement.
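The gap between arithmetic cost and runtime can be illustrated with channel shuffle, one of the operations named above. The sketch below is a plain-Python illustration on a list of channel indices, not the GPU implementation: the shuffle is a pure permutation that performs no multiply-add operations, so it adds nothing to GFLOPs but still requires data to be moved. The helper `latency_ms` simply inverts the FPS values of Table 4, assuming FPS is measured per image.

```python
def channel_shuffle(channels, groups):
    """Interleave channel groups: view as (groups, n / groups), transpose, flatten."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]

def latency_ms(fps):
    """Per-image latency in milliseconds implied by a frames-per-second figure."""
    return 1000.0 / fps

shuffled = channel_shuffle(list(range(8)), groups=2)  # [0, 4, 1, 5, 2, 6, 3, 7]

yolov8n_latency = latency_ms(374.7)    # about 2.67 ms per image
proposed_latency = latency_ms(236.2)   # about 4.23 ms per image
```

Under these assumptions, the proposed model spends roughly 1.6 ms more per image than YOLOv8n despite requiring fewer floating-point operations.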

Figure 16 compares the mAP@0.5 values of the proposed algorithm and several baseline models during training on the silk fabric dataset. With the number of epochs set to 500, most models converge and stabilize as training progresses, while training of the YOLOv3-tiny model is halted by early stopping at epoch 270. Among the models, YOLOv8n (purple curve), YOLOv7-tiny (red curve), and GSS-YOLOv8 (yellow curve) achieve the highest mAP@0.5 scores on silk fabric defect images, making them suitable for object detection tasks with high precision requirements. Notably, the GSS-YOLOv8 model demonstrates the best overall performance, indicating that the proposed algorithm achieves superior detection accuracy and robustness in fabric defect detection tasks.

Figure 16.

mAP@0.5 variation curves of different algorithms.
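The early stopping mentioned for YOLOv3-tiny follows from the patience setting in Table 2. A minimal sketch of a generic patience rule is given below; the exact criterion used by the training framework (for example, which validation score is monitored) is an assumption, and the score sequence is a toy example.

```python
def early_stop_epoch(scores, patience=25):
    """Return the 1-based epoch at which training stops because the monitored
    score has not improved for `patience` epochs, or None if it never stops."""
    best = float("-inf")
    best_epoch = 0
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return None

# Toy validation scores: the last improvement happens at epoch 5.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.5, 0.5]
stop = early_stop_epoch(toy_scores, patience=3)   # stops at epoch 8
```

Under a rule of this kind, a run ends well before the epoch limit once the validation score plateaus for the configured number of epochs.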

4.7. Validation on a public dataset

To further evaluate the feasibility and generalization capability of GSS-YOLOv8, validation experiments were conducted on the publicly available Tianchi fabric defect dataset. Figure 17 presents the qualitative detection results on weft shrinkage and warp breakage defects. It can be observed that YOLOv8n, YOLOv10n, and GSS-YOLOv8 all generate prediction boxes with relatively accurate localization and shape alignment, while GSS-YOLOv8 shows higher-confidence predictions and no missed detections in the selected samples. These qualitative results indicate that GSS-YOLOv8 can maintain stable and effective detection performance under complex defect scenarios.

Figure 17.

Detection results of different algorithms on the Tianchi fabric images: (a) original image; (b) SSD; (c) YOLOv3-tiny; (d) YOLOv5n; (e) YOLOv7-tiny; (f) YOLOv8n; (g) YOLOv9-tiny; (h) YOLOv10n; (i) YOLOv11n; (j) GSS-YOLOv8.

Table 5 further reports the quantitative comparison of GSS-YOLOv8 with other mainstream detection models and recent YOLOv8-based lightweight variants on the Tianchi fabric defect dataset. Compared with the baseline YOLOv8n, the proposed method significantly reduces the number of parameters, GFLOPs, and model size, while maintaining competitive detection accuracy (mAP@0.5 of 75.7% versus 75.8%, with recall rising from 67.8% to 71.5%). Compared with two recent YOLOv8-based lightweight variants, Improved YOLOv8 and LWFDD-YOLO, GSS-YOLOv8 also shows stronger overall detection performance while preserving clear advantages in model compactness and computational efficiency. These results demonstrate that the proposed method achieves a favorable balance between detection performance and lightweight design.

Table 5.

Comparison of detection results of different algorithms on the Tianchi fabric dataset.

Methods	Parameters	GFLOPs (G)	Model size (MB)	P (%)	R (%)	mAP@0.5 (%)
YOLOv3-tiny	12,138,458	18.9	24.5	79.3	52.7	61.4
YOLOv5n	2,507,039	7.1	5.4	74.9	62.2	69.5
YOLOv7-tiny	6,061,716	13.2	12.4	80.0	52.8	64.0
YOLOv8n	3,009,743	8.1	6.4	80.9	67.8	75.8
YOLOv9-tiny	1,974,879	7.6	4.8	76.1	65.1	71.2
YOLOv10n	2,269,263	6.5	5.9	72.7	62.5	69.4
YOLOv11n	2,586,247	6.3	5.6	81.3	68.5	74.4
SSD	24,149,526	31.4	95.1	70.8	18.0	55.0
Improved YOLOv8³⁴	2,915,341	7.9	——	53.4	45.9	44.2
LWFDD-YOLO³⁵	2,305,878	6.1	——	84.6	26.3	56.2
proposed	1,456,892	4.5	3.7	80.4	71.5	75.7

5. Conclusion

To address the limited storage resources and restricted deployment scenarios in industrial fabric inspection, this paper proposes a lightweight model named GSS-YOLOv8. The proposed model achieves a favorable balance between detection accuracy and computational cost. Specifically, the GhostHGNetV2 feature extraction network and the Slim-Neck feature fusion network replace the original backbone and neck structures of YOLOv8n, respectively. These two modules effectively reduce the number of model parameters without compromising detection performance. Furthermore, an SDEH is designed to further optimize the model architecture and reduce the parameters and computational cost of the detection head.

The experimental results show that, compared with the original YOLOv8n model, GSS-YOLOv8 reduces the number of parameters and the computational cost by 51.3% and 45.7%, respectively, while achieving better detection accuracy, with precision, recall, and mAP@0.5 values of 85.9%, 80.0%, and 86.5%, respectively. This shows that GSS-YOLOv8 reduces model complexity while maintaining detection accuracy. In addition, the feasibility of the model is verified on the public Tianchi fabric dataset, where its mAP@0.5 reaches 75.7%, which is comparable to that of YOLOv8n (75.8%) and higher than that of the other lightweight models compared.

Despite the promising results, this study still has some limitations. Although GSS-YOLOv8 achieves clear reductions in parameter count, GFLOPs, and model size, its practical inference speed is still lower than that of the original YOLOv8n, indicating that reductions in theoretical model complexity do not necessarily translate into higher runtime efficiency. In addition, the actual runtime memory consumption and deployment performance of the proposed model on specific edge devices have not yet been systematically evaluated. Furthermore, the synergy and constraint mechanisms among the lightweight modules are mainly discussed from a qualitative perspective in the current work. In future research, we will further investigate hardware-friendly lightweight optimization, conduct deployment experiments on edge platforms, and perform more detailed profiling and analysis of runtime efficiency, memory usage, and module interaction mechanisms.

Footnotes

ORCID iD

Ying Wu

Consent to participate

Informed consent was obtained from all individual participants included in the study. Participants signed informed consent regarding publishing their data.

Author contributions

Jianye Wang conceived the study, performed the majority of the experiments, and drafted the manuscript. Peiyao Guo contributed substantially to data analysis, model development, and manuscript revision. Yanping Liu assisted with experimental validation, while Panpan Zhao supported data collection and figure preparation. Ying Wu provided overall supervision, guided the research design, and served as the corresponding author.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Zhejiang Provincial Philosophy and Social Sciences Planning Project (Grant No. 26NDJC035YBMS), the Fundamental Research Funds of Zhejiang Sci-Tech University (Grant No. 26076076-Y), and the Scientific Research Fund of Zhejiang Provincial Education Department (Grant No. Y202558694).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Kumar. Computer-Vision-Based Fabric Defect Detection: A Survey. IEEE Transactions on Industrial Electronics 2008; 55(1): 348–363. https://doi.org/10.1109/tie.1930.896476

2. Zhan, Zhou. Fabric Defect Classification Using Prototypical Network of Few-Shot Learning Algorithm. Computers in Industry 2022; 138: 103628. https://doi.org/10.1016/j.compind.2022.103628

3. Shady, Gowayed, Abouiiana, et al. Detection and Classification of Defects in Knitted Fabric Structures. Textile Research Journal 2006; 76(4): 295–300. https://doi.org/10.1177/0040517506053906

4. Ren, Girshick, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Advances in Neural Information Processing Systems 2015; 28.

5. Lin, Dollár, Girshick, et al. Feature Pyramid Networks for Object Detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2117–2125.

6. Cai, Vasconcelos. Cascade R-CNN: Delving into High Quality Object Detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 6154–6162.

7. Liu, Anguelov, Erhan, et al. SSD: Single Shot MultiBox Detector. In: Proceedings of the European Conference on Computer Vision (ECCV), 2016, pp. 21–37.

8. Redmon, Divvala, Girshick, et al. You Only Look Once: Unified, Real-Time Object Detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779–788.

9. Kang, Kim. Automatic Recognition of Fabric Weave Patterns by Digital Image Analysis. Textile Research Journal 1999; 69(2): 77–83. https://doi.org/10.1177/004051759906900201

10. Hanbay, Talu, Özgüven ÖF. Fabric Defect Detection Systems and Methods: A Systematic Literature Review. Optik 2016; 127(24): 11960–11973. https://doi.org/10.1016/j.ijleo.2016.09.110

11. Alata, Ramananjarasoa. Unsupervised Textured Image Segmentation Using 2-D Quarter Plane Autoregressive Model with Four Prediction Supports. Pattern Recognition Letters 2005; 26(8): 1069–1081. https://doi.org/10.1016/j.patrec.2004.10.002

12. Zhi, Pang GKH, Yung NHC. Fabric Defect Detection Using Adaptive Wavelet. Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2001; 6: 3697–3700.

13. Yapi, Allili, Baaziz. Automatic Fabric Defect Detection Using Learning-Based Local Textural Distributions in the Contourlet Domain. IEEE Transactions on Automation Science and Engineering 2017; 15(3): 1014–1026. https://doi.org/10.1109/tase.2017.2696748

14. Ismail, Syahrir, Zain, et al. Fabric Authenticity Method Using Fast Fourier Transformation Detection. In: Proceedings of the International Conference on Electrical, Control and Computer Engineering (InECCE), 2011, pp. 233–237.

15. Fabric Defect Detection Using Wavelet Decomposition. In: Proceedings of the 3rd International Conference on Consumer Electronics, Communications and Networks (CECNet), 2013, pp. 308–311.

16. Chen, Zeng, Gao, et al. Adaptive Gabor Filtering for Fabric Defect Inspection. Journal of Computer 2020; 31(2): 45–55.

17. Gao, Liu, et al. Defect Detection for Patterned Fabric Images Based on GHOG and Low-Rank Decomposition. IEEE Access 2019; 7: 83962–83973. https://doi.org/10.1109/access.2019.2925196

18. Abouelela, Abbas, Eldeeb, et al. Automated Vision System for Localizing Structural Defects in Textile Fabrics. Pattern Recognition Letters 2005; 26(10): 1435–1443. https://doi.org/10.1016/j.patrec.2004.11.016

19. Chen, Zhi, et al. Improved Faster R-CNN for Fabric Defect Detection Based on Gabor Filter with Genetic Algorithm Optimization. Computers in Industry 2022; 134: 103551. https://doi.org/10.1016/j.compind.2021.103551

20. Liu, et al. Fabric Defects Detection Based on SSD. In: Proceedings of the 2nd International Conference on Graphics and Signal Processing, 2018, pp. 74–78.

21. Zhang, et al. A Lightweight Model for Digital Printing Fabric Defect Detection Based on YOLOX. Journal of Engineered Fibers and Fabrics 2023; 18: 15589250231208702. https://doi.org/10.1177/15589250231208702

22. Gao, Zhao. Research on Textile Defect Detection Algorithm for Deep Learning, 2024.

23. Tie, Zhu, Zheng, et al. LSKA-YOLOv8: A Lightweight Steel Surface Defect Detection Algorithm Based on YOLOv8 Improvement. Alexandria Engineering Journal 2024; 109: 201–212. https://doi.org/10.1016/j.aej.2024.08.087

24. Zhao, Wan, et al. A Lightweight Algorithm for Steel Surface Defect Detection Using Improved YOLOv8. Scientific Reports 2025; 15: 8966. https://doi.org/10.1038/s41598-025-93469-5

25. Liu, Qiao, et al. Lightweight Insulator and Defect Detection Method Based on Improved YOLOv8. Applied Sciences 2024; 14(19): 8691. https://doi.org/10.3390/app14198691

26. Liu, Wang, et al. Lightweight Online Detection Method for Potato Surface Defects Based on the Improved YOLOv8n Model. Transactions of the Chinese Society of Agricultural Engineering 2025; 41(5): 135–144.

27. Jiang, Tang, et al. Wind Turbine Blade Defect Detection Algorithm Based on Lightweight MES-YOLOv8n. IEEE Sensors Journal 2024; 24(17): 28409–28418. https://doi.org/10.1109/jsen.2024.3430351

28. Shi. YOLOv8n-Enhanced PCB Defect Detection: A Lightweight Method Integrating Spatial–Channel Reconstruction and Adaptive Feature Selection. Applied Sciences 2024; 14(17): 7686. https://doi.org/10.3390/app14177686

29. Wang, Yun, Yang, et al. OW-YOLO: An Improved YOLOv8s Lightweight Detection Method for Obstructed Walnuts. Agriculture 2025; 15(2): 159. https://doi.org/10.3390/agriculture15020159

30. Zhao, et al. DETRs Beat YOLOs on Real-Time Object Detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 16965–16974.

31. Tang, Han, Guo, et al. GhostNetV2: Enhance Cheap Operation with Long-Range Attention. Advances in Neural Information Processing Systems 2022; 35: 9969–9982.

32. Group Normalization. In: Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 3–19.

33. Chen. DEA-Net: Single Image Dehazing Based on Detail-Enhanced Convolution and Content-Guided Attention. IEEE Transactions on Image Processing 2024; 33: 1002–1015. https://doi.org/10.1109/TIP.2024.3354108

34. Jin, Liu, Nan, et al. A real-time fabric defect detection method based on improved YOLOv8. Applied Sciences 2025; 15(6): 3228. https://doi.org/10.3390/app15063228

35. Chen, Zhou, Xiao, et al. LWFDD-YOLO: A lightweight defect detection algorithm based on improved YOLOv8. Textile Research Journal 2025; 95(9–10): 1125–1142. https://doi.org/10.1177/00405175241285596