Abstract
Signature forgery detection remains a challenge in the field of biometric security. The goal is to develop automated detection systems capable of distinguishing genuine signatures from forged ones with high accuracy. Traditional signature verification methods, which rely on human judgment, are not only error-prone but also lack the speed and scalability that modern security applications require. In this paper, we present a hybrid approach that combines the feature extraction capabilities of pre-trained deep learning models with the classification accuracy of traditional machine learning algorithms. The deep learning models used in our approach include state-of-the-art convolutional neural networks, which have demonstrated high performance in image recognition tasks. These models extract highly informative features from signature images, capturing both the overall structure and subtle, distinguishing variations. After feature extraction, traditional machine learning classifiers are used to classify signatures as genuine or forged. This hybrid method utilizes the strengths of both paradigms: deep learning models excel at feature extraction, while machine learning classifiers achieve accurate classification. Experimental results on the CEDAR dataset demonstrate that our best-performing hybrid model, VGG16 + SVM, achieves an accuracy of 95.5%, outperforming standalone deep learning models while maintaining computational efficiency. The proposed approach also attains an F1-score of 95.4%, with precision and recall values of 94.8% and 96.2%, respectively. Comparisons with traditional feature-based classifiers further highlight the superiority of our hybrid method in offline signature verification. Additionally, our method significantly reduces computational costs compared to deep learning-only approaches, making it suitable for real-time applications.
Introduction
Signature forgery detection is a critical challenge in various sectors, such as financial institutions, legal frameworks, and government functions, where the authenticity of signed documents is crucial. Signatures remain one of the most widely used means of identification and validation of consent. However, their misuse through forgery can have serious consequences, such as financial losses, identity theft, legal disputes, and security breaches.
The evolution of signature forgery techniques has progressively outpaced conventional verification mechanisms, driven by advancements in digital manipulation tools. Manual verification, while effective in isolated cases where experts analyze nuanced features such as stroke curvature, pen pressure, and spacing, suffers from scalability limitations. Its reliance on subjective human judgment introduces inconsistencies and delays, particularly in high-throughput environments requiring real-time decision-making.
Compounding these challenges, the digitization of financial, legal, and commercial workflows has created new attack vectors. Malicious actors now exploit software tools to generate synthetic forgeries or alter genuine signatures with pixel-level precision, rendering traditional visual inspection obsolete. To mitigate these risks, industries demand automated systems that achieve dual objectives: (1) precision in differentiating subtle artifacts of genuine versus forged signatures, and (2) scalability to process high volumes of transactions without compromising speed. Such systems must adapt to both offline and digital signature formats while maintaining resilience against evolving forgery tactics.
Recent advances in machine learning offer promising solutions to these challenges. Neural networks, particularly
Complementing deep learning,
This synergy of deep learning for feature extraction and machine learning for classification aligns with the growing demand for systems that balance accuracy, interpretability, and real-time performance—a gap our methodology directly addresses.
Traditional verification methods lack efficiency, and the complexity of forgery techniques highlights the need for automated detection systems. Deep learning has proven to be a highly effective approach for image analysis, particularly through the use of CNNs capable of identifying complex patterns. However, stand-alone deep learning models often require significant computational resources, limiting their practicality for real-time applications or in resource-constrained environments. In contrast, traditional machine learning algorithms are faster at classification but face challenges when dealing with high-dimensional data.
The present work proposes a hybrid approach that combines the advantages of both deep learning and machine learning. More specifically, pre-trained CNNs are used to extract high-level, informative features from signature images, and traditional machine learning classifiers, such as Support Vector Machines (SVMs) and K-Nearest Neighbors (KNNs), are used for classification. This hybrid framework aims to achieve both high accuracy and low computational cost.
The main contribution of the present paper is the combination of pre-trained convolutional neural networks (CNNs), such as VGG16, 3 DenseNet121, 4 MobileNetV2 5 and EfficientNetV2S, 6 with traditional machine learning classifiers, such as Support Vector Machines (SVMs), 7 K-Nearest Neighbors (KNNs), 8 Decision Trees, 9 Random Forests, 10 Naive Bayes 11 and Logistic Regression. 12 This approach uses CNNs to extract features from signature images, which are then classified with machine learning algorithms. By comparing the performance of different CNNs and machine learning classifiers, we show that this hybrid approach achieves high accuracy and improves the system’s ability to handle real-time data.
Existing approaches to signature forgery detection predominantly focus on either end-to-end deep learning models or traditional machine learning classifiers based on handcrafted features. While deep learning models excel in automated feature extraction, their high computational costs hinder real-time deployment. Conversely, traditional machine learning methods offer computational efficiency but struggle to capture intricate spatial and structural details of signatures.
To address these challenges, this paper presents the following contributions:
A hybrid methodology that integrates pre-trained Convolutional Neural Networks (CNNs), including VGG16, DenseNet121, MobileNetV2, and EfficientNetV2S, with traditional classifiers such as Support Vector Machines (SVMs), K-Nearest Neighbors (KNN), and Random Forests. This framework leverages the representational power of deep learning for feature extraction while maintaining the computational efficiency of machine learning for classification.
A systematic exploration of multiple CNN architectures and diverse classifiers to identify optimal combinations. Unlike prior studies limited to single-model evaluations, our work benchmarks the performance of hybrid configurations across accuracy, training time, and inference speed.
Extensive experiments on the CEDAR dataset, which demonstrate that the hybrid approach achieves state-of-the-art accuracy (95.5%) with the VGG16+SVM configuration, outperforming standalone deep learning models. The framework also reduces computational overhead by 98% compared to end-to-end CNNs, enabling real-time applicability.
Evaluation in offline signature verification scenarios, emphasizing practicality for security-sensitive domains like banking and legal authentication. The proposed method balances precision (94.8%) and recall (96.2%), minimizing both false positives and false negatives.
A comprehensive analysis of the accuracy-efficiency trade-off, demonstrating that hybrid models achieve near-optimal performance with significantly lower resource demands. For instance, VGG16+Logistic Regression attains 93.9% accuracy with training times under 2 seconds, making it viable for edge devices.
These contributions set our work apart from existing methodologies and highlight the potential of hybrid deep learning and machine learning strategies in improving signature verification accuracy and efficiency.
The rest of the paper is organized as follows: Section 2 reviews prior work on signature forgery detection. Section 3 provides an overview of CNNs and machine learning classifiers. Section 4 describes the hybrid approach, including preprocessing, feature extraction, and classification. Section 5 presents the experimental results, Section 6 discusses future work, and Section 7 concludes the paper.
Related work
Research on signature forgery detection has explored feature extraction, deep learning, and hybrid approaches. This section reviews key research works, discusses existing gaps, and presents how the proposed method addresses these issues.
What has been done
Previous studies have primarily focused on two major approaches: traditional machine learning and deep learning. Traditional machine learning methods rely on handcrafted feature extraction techniques to differentiate genuine and forged signatures. The study by Elnadree et al. 13 explores traditional feature extraction techniques, emphasizing their impact on accuracy and efficiency. Although these methods are fast, they struggle to capture the intricate details required to distinguish complex forgeries.
Deep learning approaches, particularly convolutional neural networks (CNNs), have shown significant improvements in accuracy. Tan and Le 14 demonstrated the effectiveness of deep learning in extracting hierarchical features from images, with EfficientNet achieving high accuracy but at a high computational cost. Similarly, Kadam et al. 15 employed a deep learning model (Mask R-CNN with MobileNet V1) for detecting image forgeries, highlighting MobileNet’s ability to capture essential features with a reduced computational load. The comparative study on multimodal biometric systems 16 illustrates how combining various techniques can enhance system robustness.
Hybrid approaches, which integrate deep learning with machine learning classifiers, have also been explored. Wang et al. 17 proposed a CNN-based feature extraction technique combined with machine learning classifiers for writer-independent signature verification. Similarly, Hafemann et al. 18 considered a hybrid framework but focused on writer-dependent scenarios, limiting its ability to generalize across different datasets.
Data augmentation techniques have been studied to improve generalization. Shorten and Khoshgoftaar 19 provided a comprehensive overview of data augmentation techniques, which are crucial in addressing the challenges of limited datasets in signature verification.
Dynamic signature verification methods, such as those of Diaz et al., 20 incorporate temporal features like pressure and velocity. However, they require specialized hardware, making them impractical for offline applications.
What is still missing
Despite these advancements, several challenges remain unresolved:
Our contributions
To address these challenges, our work introduces a hybrid approach that uses the strengths of both deep learning and traditional machine learning classifiers:
These contributions position our work as an improvement over existing methodologies, addressing computational limitations while maintaining high accuracy in signature forgery detection.
Background knowledge
In this section, we provide an overview of the fundamental technologies on which the proposed hybrid signature forgery detection system is based: deep learning and machine learning. Both of these areas have made progress in recent years, and their combined use allows efficient extraction of intricate patterns from signature images and precise classification as genuine or forged.
Deep learning
Deep learning is a subset of machine learning that focuses on training layered neural networks to learn features from data. Deep learning models, such as convolutional neural networks (CNNs), 21 are effective in tasks like signature forgery detection because they identify complex patterns and subtle variations that are not easy to detect manually. In signature forgery detection, tiny variations in stroke movement, line thickness, and writing pressure can indicate forgery, and CNNs are well suited to capturing them. CNNs consist of multiple layers, including convolutional layers, pooling layers, and fully connected layers, each of which captures finer details from the image.
In our hybrid approach, we use several pre-trained CNN models for feature extraction. These models have been trained on large-scale datasets (such as ImageNet 22 ), which allows them to generalize well to new data, including signature images. The models used in our study include:
VGG16 and VGG19, 3 developed by Simonyan and Zisserman, are known for their simple architecture, which consists of a series of convolution layers with small 3×3 filters.
DenseNets, 4 proposed by Huang et al., introduce dense connectivity between layers, where each layer receives inputs from all previous layers. This dense connectivity reduces the number of parameters and improves the propagation and reuse of features. DenseNet models are effective in handling complex patterns, as they allow the network to better learn both global and local features.
MobileNet models, 5 designed for efficiency, use depthwise separable convolutions to reduce the number of parameters with little loss of accuracy. MobileNetV1, the original version, is suitable for mobile and embedded systems where computational resources are limited. It balances model size, speed, and performance, making it well suited to low-latency applications. MobileNetV2 builds on this architecture and incorporates inverted residuals and linear bottlenecks, improving model efficiency while maintaining high accuracy. MobileNetV2 is optimized for resource-constrained environments such as mobile devices. In the context of signature verification, both MobileNetV1 and MobileNetV2 enable fast processing without sacrificing the ability to capture detailed handwriting features, making them suitable for real-time detection systems.
EfficientNetV2 6 is a scalable model that balances depth, width, and resolution and offers improved accuracy with fewer parameters than previous CNN models. EfficientNetV2S is a small variant that is well suited to signature forgery detection, as it efficiently captures the complex visual patterns in signature images.
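The parameter saving that MobileNet obtains from depthwise separable convolutions can be checked with simple arithmetic. The sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1×1 pointwise convolution. The function names and the layer sizes (3×3 kernel, 128 input channels, 256 output channels) are illustrative assumptions and are not taken from the models evaluated in this paper.

```python
def standard_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (bias terms ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


# Illustrative layer sizes (assumed, not from the paper).
standard = standard_conv_params(3, 128, 256)
separable = depthwise_separable_params(3, 128, 256)
reduction = standard / separable
print(f"standard: {standard}, separable: {separable}, reduction: {reduction:.1f}x")
```

For this assumed layer, the separable form needs roughly one ninth of the weights, which is the kind of saving that makes the architecture attractive on constrained hardware.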
The aforementioned pre-trained models are used to extract features from signature images, which are then passed to machine learning classifiers for further analysis. The feature extraction process is crucial because it converts raw pixel data into informative representations that can be effectively used by classifiers to distinguish between genuine and fake signatures.
Machine learning
While deep learning models excel at extracting features from complex data, machine learning algorithms are particularly well suited for classification. In our hybrid system, after extracting features using deep learning models, we pass these features to traditional machine learning classifiers. These classifiers are chosen for their ability to efficiently process structured feature data and make accurate predictions, even when working with high-dimensional inputs derived from CNNs. The machine learning classifiers used in this study include:
SVMs 7 are particularly effective for binary classification tasks, such as distinguishing between genuine and forged signatures. SVMs work by finding the optimal hyperplane that separates two classes in a high-dimensional feature space, maximizing the margin between the closest data points (support vectors) of each class. SVMs are robust classifiers, especially in tasks with complex, non-linear boundaries, which makes them ideal for detecting forged signatures.
KNN 8,23 is a well-known, simple but effective algorithm that classifies an instance based on the majority class of its nearest neighbors. The algorithm measures the distance between the instance and the training instances, usually using the Euclidean distance. KNN is particularly useful for classification tasks where the decision boundary is not linear. When combined with features extracted from deep learning models, KNN can classify signatures based on their distance to known genuine or forged training samples.
Decision trees 9 are hierarchical models that recursively partition the feature space into regions corresponding to different classes. Each tree node represents a decision based on a feature and the branches represent outcomes. In signature forgery detection, decision trees are useful for their interpretability and their ability to handle both categorical and numerical data. They can capture complex decision boundaries.
Random forests 10 generate a set of multiple decision trees during training and aggregate their results to make a final prediction. This reduces the risk of overfitting associated with individual decision trees. Random forests are effective for tasks involving noisy or limited data, such as signature forgery detection, as they provide robust predictions by aggregating the results of multiple trees.
The Naive Bayes 11 classifier is based on Bayes’ theorem and assumes independence between features. Despite this simplifying assumption, Naive Bayes is fast and performs well when combined with well-extracted features. In signature forgery detection, Naive Bayes can classify signatures quickly, making it useful in systems that require real-time processing.
Logistic Regression 12 is a widely used statistical model that is particularly effective for binary classification tasks, such as distinguishing between genuine and forged signatures. Unlike linear regression, which predicts continuous outcomes, logistic regression predicts the probability that a sample belongs to a particular class by mapping the output to a range between 0 and 1 using the sigmoid function. The model finds a set of weights for the features in the data, which yields a linear decision boundary that separates the two classes. In cases where the feature space is high dimensional, logistic regression can still perform well, especially when combined with regularization techniques to avoid overfitting. Logistic regression is valued for its simplicity, interpretability, and efficiency in training, making it suitable for signature forgery detection tasks where simple and fast classification is required. When used in conjunction with features extracted from deep learning, logistic regression can effectively classify signatures by analyzing how strongly each feature contributes to the likelihood that the signature is genuine or forged.
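The mapping from a weighted sum of features to a probability, as described above for logistic regression, can be sketched in a few lines. The weights, bias, and feature values below are hypothetical, and treating the output as the probability of the "forged" class is an assumption made only for illustration.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function, mapping any real value into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features) -> float:
    """Probability of the positive class for a single feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical two-feature model; real CNN feature vectors have hundreds of values.
weights, bias = [1.0, -2.0], 0.5
on_boundary = predict_proba(weights, bias, [1.0, 0.75])  # weighted sum is 0
confident = predict_proba(weights, bias, [2.0, 0.0])     # weighted sum is 2.5
print(on_boundary, confident)
```

A sample whose weighted sum is zero lies exactly on the linear decision boundary and receives probability 0.5. Larger positive sums push the probability toward 1.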
Each of these machine learning algorithms has specific advantages that make them suitable for classifying signature features extracted from deep learning models. By exploiting the complex, high-level features captured by CNNs, these classifiers can make accurate predictions even with subtle differences between genuine and fake signatures. The combination of deep learning for feature extraction and machine learning for classification allows us to take advantage of the best of both methodologies. Deep learning is appropriate for learning complex representations directly from raw data, such as handwritten signatures. In contrast, traditional machine learning algorithms can efficiently classify these representations, and, at the same time, they offer greater interpretability and faster computation in real-time scenarios.
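As an illustration of how a classifier operates on extracted feature vectors, the nearest-neighbor rule described above for KNN can be written directly: compute the Euclidean distance from the query to every training vector and take a majority vote among the k closest. This is a minimal sketch on made-up two-dimensional vectors. The function name, the data, and k = 3 are assumptions, and a real system would typically rely on a library implementation.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k training vectors closest to the query."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy feature vectors (assumed); CNN features would have far more dimensions.
train_x = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.75)]
train_y = ["genuine", "genuine", "genuine", "forged", "forged", "forged"]

print(knn_predict(train_x, train_y, (0.2, 0.2)))  # closest to the genuine cluster
print(knn_predict(train_x, train_y, (0.8, 0.8)))  # closest to the forged cluster
```

Because the training set itself acts as the model, no fitting step is needed. All of the cost is paid at prediction time, when distances to the stored vectors are computed.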
Proposed methodology
The proposed methodology for signature forgery detection is based on a hybrid approach that integrates deep learning for feature extraction with traditional machine learning classifiers. This combination allows us to utilize the advantages of both approaches: the ability of deep learning to automatically extract complex features from signature images, and the efficiency and accuracy of machine learning in classification tasks. Below, we describe each element of the proposed methodology, from initial data preprocessing to model training and evaluation. Preprocessing is an essential step in our methodology to ensure that the signature images are in optimal condition for feature extraction from deep learning models. The stages of the preprocessing pipeline are described below.
Preprocessing
Preprocessing is essential for standardizing and enhancing the raw signature images to ensure compatibility with the deep learning models. The preprocessing pipeline consists of the following steps:
Feature extraction with pre-trained CNNs
Pre-trained CNNs are a cornerstone of the hybrid methodology, providing powerful feature extraction capabilities. These models leverage transfer learning from the ImageNet dataset, 22 making them highly effective for a variety of image-based tasks, including signature verification.
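A feature extractor of this kind can be sketched with the Keras applications API. The example uses VGG16 without its classification head and applies global average pooling, so that each image is mapped to a single 512-value vector. The 224×224 input size, the pooling choice, and the helper names `build_extractor` and `extract_features` are assumptions made for illustration, and the configuration used in the experiments may differ.

```python
import numpy as np
from tensorflow.keras.applications import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input


def build_extractor(weights="imagenet"):
    """VGG16 backbone without its classifier head (hypothetical helper).

    include_top=False drops the ImageNet classification layers, and
    pooling="avg" averages the last feature map into one vector per image.
    Passing weights=None builds the same architecture with random weights.
    """
    return VGG16(weights=weights, include_top=False, pooling="avg",
                 input_shape=(224, 224, 3))


def extract_features(extractor, images):
    """Map RGB images of shape (n, 224, 224, 3), values 0..255, to (n, 512) features."""
    batch = preprocess_input(np.array(images, dtype="float32"))
    return extractor.predict(batch, verbose=0)
```

With `extractor = build_extractor()`, calling `extract_features(extractor, images)` on a batch of resized signature images returns one row of features per image, which can then be passed to any of the traditional classifiers.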
Classification using machine learning algorithms
Once feature vectors are extracted, traditional machine learning classifiers are employed to classify the signatures as genuine or forged. These classifiers are computationally efficient and work well with the structured feature vectors provided by CNNs.
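The classification stage, together with the measurement of training and testing time, can be sketched with scikit-learn. Synthetic 512-dimensional vectors stand in for CNN features so that the example runs on its own. The linear kernel, the feature standardization step, and the helper name `train_and_time` are assumptions rather than the settings used in the experiments.

```python
import time

import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def train_and_time(x_train, y_train, x_test, y_test):
    """Fit a scaler plus SVM and return (accuracy, training seconds, testing seconds)."""
    classifier = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    start = time.perf_counter()
    classifier.fit(x_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = classifier.predict(x_test)
    test_seconds = time.perf_counter() - start
    return accuracy_score(y_test, predictions), train_seconds, test_seconds


# Synthetic stand-in for extracted features: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
features = np.vstack([rng.normal(0.0, 1.0, size=(100, 512)),   # label 1: genuine
                      rng.normal(1.0, 1.0, size=(100, 512))])  # label 0: forged
labels = np.array([1] * 100 + [0] * 100)
order = rng.permutation(200)
train, test = order[:160], order[160:]

acc, t_train, t_test = train_and_time(features[train], labels[train],
                                      features[test], labels[test])
print(f"accuracy={acc:.3f} train={t_train:.4f}s test={t_test:.4f}s")
```

Swapping `SVC` for another scikit-learn estimator, such as `KNeighborsClassifier` or `LogisticRegression`, leaves the timing logic unchanged, which is what makes comparisons across classifiers straightforward.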
Hybrid training and evaluation pipeline
The hybrid training and evaluation process is designed to balance accuracy and computational cost. The pipeline is outlined below:
Advantages of the hybrid approach
The proposed hybrid model offers several advantages:
Experimental study
We evaluate the effectiveness of our hybrid approach by conducting experiments on a benchmark dataset. In effect, we evaluate the ability of the proposed methodology to identify genuine and forged signatures accurately and efficiently. The study presents the details of the dataset used, the experimental setup and then the experimental measurements.
Dataset
For our experimental study, we used the CEDAR dataset for offline handwritten signature verification, which is available on Kaggle. This dataset provides a balanced collection of genuine and forged signatures. The CEDAR dataset includes multiple samples from each individual, with genuine signatures and forgeries.
Genuine signatures: Each sample reflects the unique writing style of the signer, with multiple samples per individual to capture natural variations.
Forged signatures: Created by impersonators who attempt to copy the style of the original signer. Forgeries vary in quality, providing a realistic mix of skilled and more easily detectable forgeries, which requires the model to detect subtle differences from genuine samples.
Experimental setup
The dataset was divided into training and test sets: 80% of the data was used for training, while 20% was reserved for testing the performance of the models. In addition, a portion of the training set was set aside as a validation set for tuning the hyper-parameters. The performance of the classification models is evaluated using standard classification metrics. These metrics are calculated on the test set, which consists of unseen data, so that each model is evaluated under realistic conditions. The metrics used are:
Training time
The total time required to train the model on the dataset. This metric helps to assess the computational efficiency of the model. Lower training time can be beneficial, especially for models that need frequent retraining.
Testing time
The total time taken to classify signatures in the test dataset. This metric indicates the model’s responsiveness, particularly important for real-time applications. Faster testing times are preferred, as they suggest a more efficient model in production environments.
Accuracy
The percentage of correctly classified signatures (genuine and forged) out of the total number of predictions. This metric provides an overall view of the model’s performance.
Precision (genuine)
The percentage of correctly identified genuine signatures out of the total number of signatures classified as genuine. This metric reflects the model’s ability to avoid falsely labeling forged signatures as genuine.
Precision (forged)
The percentage of correctly identified forged signatures out of the total number of signatures classified as forged. This helps in evaluating how well the model avoids false positives when detecting forged signatures.
Recall (genuine)
The percentage of actual genuine signatures that were correctly classified as genuine. Recall is also known as sensitivity. It indicates the model’s ability to correctly identify genuine signatures.
Recall (forged)
The percentage of actual forged signatures that were correctly classified as forged. This metric reflects the model’s capability to detect forged signatures and avoid false negatives.
F1-score (genuine)
The harmonic mean of precision and recall for genuine signatures. It provides a balanced metric that considers both false positives and false negatives for genuine signatures, summarizing the model’s ability to classify genuine signatures accurately.
F1-score (forged)
The harmonic mean of precision and recall for forged signatures.
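The per-class precision, recall, and F1 definitions above can be computed directly from the true and predicted labels by treating each class in turn as the positive class. The following is a minimal sketch on made-up labels. The function name and the data are assumptions, and library routines such as those in scikit-learn provide the same quantities.

```python
def class_metrics(y_true, y_pred, positive):
    """Return (precision, recall, F1) with `positive` treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up labels: one genuine sample and one forged sample are misclassified.
y_true = ["genuine"] * 5 + ["forged"] * 5
y_pred = ["genuine", "genuine", "genuine", "genuine", "forged",
          "forged", "forged", "forged", "genuine", "forged"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
genuine_scores = class_metrics(y_true, y_pred, "genuine")
forged_scores = class_metrics(y_true, y_pred, "forged")
print(accuracy, genuine_scores, forged_scores)
```

In this toy case the two classes have identical scores because the errors are symmetric. With unbalanced errors, the genuine and forged values diverge, which is why both are reported.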
In the context of forged signature identification, both types of classification mistakes—false positives (incorrectly predicting a forged signature) and false negatives (incorrectly predicting a genuine signature)—carry significant consequences. If a classifier erroneously identifies a forged signature as genuine, it could facilitate fraud. However, if it mistakenly classifies a genuine signature as forged, it could lead to unjust cancellation of agreements or, worse, wrongful accusations or imprisonment of an innocent person. Therefore, it is difficult to determine whether false positives or false negatives are more critical. Consequently, recall and precision hold equal importance in this context. However, the precision and recall for identifying forged signatures are more critical than those for genuine signatures, as the primary goal is to minimize the risk of fraud while maintaining a fair and reliable classification.
The traditional Root Mean Square Error (RMSE) metric is not used, since for binary outputs it carries the same information as accuracy. When the prediction output is binary (true/false or 0/1), computing the RMSE or RRMSE is not the most appropriate choice. RMSE is the square root of the average squared deviation between predicted and actual values, which is more relevant for continuous variables than for discrete binary classification. The deviation is always between 0 and 1 since both the predicted and the actual values are binary.
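The equivalence between RMSE and accuracy for binary outputs follows from the fact that each squared deviation is 1 for a misclassified sample and 0 otherwise, so the mean squared error equals the error rate and RMSE = sqrt(1 - accuracy). The short check below illustrates this on made-up 0/1 labels. The data and function names are assumptions.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equally long numeric sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def accuracy(y_true, y_pred):
    """Fraction of positions where prediction and ground truth agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up binary labels with two errors out of ten.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]

acc = accuracy(y_true, y_pred)
err = rmse(y_true, y_pred)
print(acc, err, math.sqrt(1 - acc))
```

Since one value determines the other, reporting RMSE alongside accuracy would add no information for this task.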
The experiments were conducted in Python, using TensorFlow 28 and Keras 29 for feature extraction via CNNs, 21 and Scikit-learn 30 for training the machine learning classifiers. The experiments ran on a machine with GPU support to accelerate feature extraction and training. The hardware configuration included an NVIDIA GA107GL [A2 / A16] GPU, 32 cores of an Intel(R) Xeon(R) Gold 5218R CPU @ 2.10 GHz, and 128 GB of RAM.
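The 80/20 division of the data into training and test sets can be sketched as follows. The example keeps the class proportions equal in both parts (a stratified split), which is an assumption, as are the helper name, the seed, and the toy labels. In practice, Scikit-learn's `train_test_split` with its `stratify` argument serves the same purpose.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices), preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within each class
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy label list (assumed): 50 genuine and 50 forged samples.
labels = ["genuine"] * 50 + ["forged"] * 50
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Fixing the seed makes the split reproducible, so that every CNN and classifier combination is evaluated on the same test samples.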
In this section, we compare the performance of several deep learning models and the proposed hybrid models. The comparisons focus on two main metrics: accuracy and execution time (divided into training and testing time). The visualizations in Figures 2 and 3 illustrate these metrics for each model-classifier combination. Figure 1 shows the symbols and colors used in the following plots.


Accuracy vs training time. (a) Linear scale; (b) Logarithmic scale.
Figure 2 presents the accuracy achieved by each model-classifier combination as a function of training time. The figure includes two plots: one on a linear scale and one on a logarithmic scale to provide different perspectives. The hybrid models demonstrate significantly shorter training times than standard deep learning models, with minimal to no sacrifice in accuracy. This makes them an excellent choice for scenarios where computational resources or time constraints are a factor. Although models such as EfficientNetV2 achieve high accuracy, their training times are significantly longer. This highlights a trade-off between achieving the highest possible accuracy and maintaining a reasonable training efficiency. Among classifiers, K-Nearest Neighbors (k-NN) and Decision Tree classifiers generally exhibit shorter training times than more complex classifiers such as Random Forest, especially when combined with deep learning models. The logarithmic scale plot highlights that the hybrid models based on the k-NN classifier are the fastest in terms of training time while achieving high accuracy. This is expected since k-NN is a lazy classifier, meaning it does not perform explicit training.
Figure 3 depicts the test time versus accuracy, providing information on the viability of each model for real-time applications. The figure includes two plots: one on a linear scale and one on a logarithmic scale. The logarithmic scale plot highlights that random forest and decision tree-based models are the fastest in terms of test time, with random forests also achieving higher accuracy. While logistic regression is not as fast as random forests, the combination of DenseNet169 with logistic regression can achieve an accuracy of over 0.95. As with training time, deep learning models with high complexity (e.g., EfficientNetV2B0 and DenseNet) tend to have longer test times. This makes them less practical for real-time applications, despite their high accuracy.

Accuracy vs testing time. (a) Linear scale; (b) Logarithmic scale.
Table 1 provides an overview of the performance metrics for the VGG16 and VGG19 models, both on their own and combined with the traditional classifiers. The metrics reported are training time, testing time, accuracy, precision, recall, and F1-score. Best measurements are in bold. VGG16 and VGG19 alone require significantly higher training and testing times compared to their hybrid versions with traditional classifiers, demonstrating a clear advantage of the hybrid approach for applications requiring faster retraining. VGG16 paired with SVM achieved the highest accuracy, outperforming the deep learning model alone. VGG19 with Logistic Regression also performed well, achieving a high accuracy that is comparable to that of the VGG19 deep learning model but with considerably less training time. These results show that certain hybrid combinations can match or even exceed the deep learning model’s accuracy while requiring far less computational time. For most classifiers, both VGG16 and VGG19 maintained high precision and recall, particularly for SVM, Random Forest, and K-Nearest Neighbors. While Naive Bayes achieved the lowest accuracy, it performed reasonably well with VGG16 in recall for forged signatures, suggesting that it may still be useful in applications where recall for forged cases is prioritized.
Performance metrics for VGG16 and VGG19 models.
Best values are in bold.
For real-time signature forgery detection applications, the hybrid models stand out due to their minimal training and testing times. Their higher accuracy and the speed advantage they offer are significant in applications requiring quick processing. The table illustrates that for environments where accuracy is paramount, pairing VGG16 or VGG19 with SVM or Logistic Regression is effective. For environments with even stricter time constraints, Decision Tree or K-Nearest Neighbors paired with VGG16 or VGG19 may be preferable, given their superior speed and reasonable accuracy.
In conclusion, this table demonstrates that hybrid models can achieve a balance of high accuracy and efficiency. Specific combinations, such as VGG16 with SVM or VGG19 with Logistic Regression, offer robust performance, suggesting that hybrid models can be tailored to meet different application requirements, whether the focus is on maximizing accuracy or achieving real-time responsiveness.
Table 2 provides an in-depth look at the DenseNet121, DenseNet169, and DenseNet201 models, both on their own and combined with the traditional classifiers. The best measurements are in bold. Among the DenseNet models, K-Nearest Neighbors consistently achieves the shortest training times across all three architectures. This is because K-Nearest Neighbors does not build a classification model: it uses the training set itself as the model and therefore needs essentially no training time. For testing, Decision Tree shows the fastest testing times. These rapid testing times are beneficial in applications requiring quick model responses. DenseNet121 and DenseNet169 with K-Nearest Neighbors achieve the highest accuracy among their respective models, with DenseNet121 at 0.962 and DenseNet169 at 0.975. This suggests that pairing DenseNet models with K-Nearest Neighbors can yield very high accuracy levels. For DenseNet201, K-Nearest Neighbors also delivers the top accuracy at 0.964. This consistent performance makes it a reliable choice for high accuracy in hybrid models. DenseNet169 with K-Nearest Neighbors achieves perfect precision for forged signatures, meaning that every signature it classified as forged was indeed forged. Across all DenseNet models, K-Nearest Neighbors yields the highest F1-scores, reflecting a strong balance between precision and recall. The combination of DenseNet models with Decision Tree and K-Nearest Neighbors demonstrates significantly lower training and testing times compared to deep learning-only implementations. This makes these hybrids well suited to environments with limited computational resources, such as embedded systems, where rapid processing is essential.
Table 2. Performance metrics for DenseNet models. Best values are in bold.
These observations indicate that hybrid models, particularly with K-Nearest Neighbors, offer a compelling balance between accuracy and efficiency. They can deliver competitive performance metrics comparable to deep learning models while requiring significantly less computation, making them suitable for practical deployment in signature forgery detection.
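The near-zero training time of K-Nearest Neighbors follows directly from how the classifier works, and a minimal sketch makes this visible. The code below is an illustration only, not the implementation used in the experiments: the class name `SimpleKNN`, the two-dimensional feature vectors, and the value `k=3` are all assumptions chosen for readability, whereas real CNN feature vectors have hundreds or thousands of dimensions.

```python
import math
from collections import Counter


class SimpleKNN:
    """Minimal k-nearest-neighbours classifier (illustrative sketch only)."""

    def __init__(self, k=3):
        self.k = k
        self.features = []
        self.labels = []

    def fit(self, features, labels):
        # "Training" only stores the data; no model parameters are estimated.
        self.features = list(features)
        self.labels = list(labels)
        return self

    def predict_one(self, query):
        # All the work happens at prediction time:
        # one distance computation per stored training vector.
        distances = sorted(
            (math.dist(query, vector), label)
            for vector, label in zip(self.features, self.labels)
        )
        # Majority vote among the k closest training samples.
        votes = Counter(label for _, label in distances[: self.k])
        return votes.most_common(1)[0][0]


# Hypothetical 2-D stand-ins for CNN feature vectors.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
           (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["genuine", "genuine", "genuine", "forged", "forged", "forged"]

knn = SimpleKNN(k=3).fit(train_x, train_y)
print(knn.predict_one((0.12, 0.18)))
print(knn.predict_one((0.88, 0.82)))
```

Because `fit` only copies the data, its cost is negligible, while `predict_one` scales with the size of the training set. This is consistent with the observation that K-Nearest Neighbors trains fastest but a Decision Tree, which only follows a short path of comparisons, tests fastest.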
In Table 3, we continue to compare various deep learning models (MobileNet, MobileNetV2, EfficientNetV2B0) with their corresponding hybrid models using different classifiers. Deep learning-only models such as MobileNet and EfficientNetV2B0 require significant computational resources for training and testing. By introducing classifiers like K-Nearest Neighbors, the hybrid models achieve comparable or even higher accuracy with reduced computational demands. For instance, MobileNet with KNN not only achieves high accuracy (0.956) but also has a minimal training time, making it suitable for quick deployment scenarios. For applications where accurately identifying forgeries is paramount, SVM and KNN classifiers excel in balancing precision and recall. The consistently high performance of KNN in precision, recall, and F1-scores across various models indicates that it is particularly well-suited for distinguishing genuine signatures from forged ones. This can be advantageous in financial or security sectors where misclassification of forgeries could lead to significant risks. For real-time or interactive systems that require immediate feedback, models with Decision Tree classifiers are advantageous due to their minimal testing times, despite slightly lower accuracy levels. Alternatively, if accuracy remains a priority, KNN offers a reasonable balance between speed and performance, especially with MobileNet and MobileNetV2 models.
Table 3. Performance metrics for MobileNet and EfficientNet models. Best values are in bold.
In conclusion, the tables illustrate that hybrid models combining deep learning and traditional classifiers can effectively improve signature forgery detection, achieving high accuracy, precision, and recall with efficient training and testing times. K-Nearest Neighbors and SVM classifiers stand out as robust choices, each excelling in different contexts depending on the specific requirements for speed and accuracy. These results underscore the flexibility of hybrid models, allowing for optimization based on application-specific needs, whether that be high precision in security-focused applications or faster response times for real-time systems.
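For readers who want to reproduce the reported measures, the sketch below shows how accuracy, precision, recall, and F1-score follow from the four confusion-matrix counts. It uses the standard definitions; the function name `classification_metrics`, the choice of "forged" as the positive class, and the counts in the example are assumptions made for illustration and are not results from the tables.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, precision, recall and F1 for the positive class.

    tp/fp/tn/fn are confusion-matrix counts. Here the positive class is
    assumed to be "forged"; swap the roles to score "genuine" instead.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything flagged as forged, how much really was forged.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all real forgeries, how many were caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=85, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions, perfect precision for the forged class means `fp == 0`, regardless of how many forgeries are missed, which is why precision and recall are reported together.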
Our experimental results show that the hybrid model offers a balanced compromise between accuracy and efficiency. The key finding of our evaluation is the trade-off between accuracy and speed relative to pure deep learning models. Although some deep learning models, such as EfficientNetV2B0 and Xception, achieved slightly higher accuracy, the hybrid model demonstrated significantly faster training and testing times, which makes it a candidate for real-time applications. This speed advantage is particularly beneficial for applications such as signature forgery detection, where real-time or near-real-time processing is crucial. From a practical point of view, the small sacrifice in accuracy (often within 1–2 percent) is in some cases outweighed by the significant gain in efficiency. For example, in scenarios where the model must process thousands of signatures in quick succession, the faster processing speed of the hybrid model can significantly improve usability and responsiveness without substantially compromising detection reliability. This trade-off becomes even more valuable in resource-constrained environments, such as mobile or embedded systems, where computing power and battery life are limited. Here, the ability of the hybrid model to provide high accuracy with lower computational requirements makes it a viable option for real-world deployment. Moreover, some hybrid configurations outperformed some deep learning-only models in accuracy, which shows that the hybrid approach can combine efficiency with effectiveness and is suitable for applications that demand both speed and reliability.
Another advantage revealed by the experimental study is the scalability of the hybrid model. Because its training time is relatively low, the hybrid model is well suited to regular retraining with new data. This is crucial because forgery patterns may evolve over time or new types of forgery may emerge. By quickly retraining on updated datasets, the hybrid model can remain effective against emerging threats at minimal extra computational cost. The same property improves its adaptability to different datasets. Since training and testing times are low, it becomes feasible to adapt the model to different signature styles and languages across geographical or demographic regions. This adaptability means that the model can be modified easily, without the extensive computational resources typically needed for deep learning models.
Future work
Although the proposed hybrid approach has shown promising results in signature forgery detection, several areas remain for future research. First, we aim to extend our evaluation to additional publicly available datasets, such as GPDS, MCYT-75, and SVC2024, to further validate the model’s performance. Expanding the dataset coverage will provide a more comprehensive assessment of our approach’s robustness across different signature styles and forgery techniques.
Another area of research involves investigating few-shot learning techniques to address scenarios with limited labeled signature samples. This would allow the model to perform effectively in real-world applications where collecting large amounts of labeled data is impractical.
In addition, optimizing the computational efficiency of the hybrid framework remains a priority for real-time applications. Future work will explore lightweight deep learning architectures, such as pruned or quantized versions of existing CNNs, to reduce memory and processing demands. Minimizing the computational cost of the model will facilitate deployment in embedded systems, mobile devices, or edge computing platforms, where low latency and energy efficiency are paramount.
Conclusion
In this study, we presented a hybrid approach for signature forgery detection that combines the feature extraction capability of deep learning models with the efficiency of traditional machine learning algorithms. Our methodology uses pre-trained CNN models to extract features from signature images, which are then classified by various machine learning classifiers. In our work, we used SVM, KNN, decision trees, random forests, Naive Bayes, and logistic regression. The hybrid method shows the combined advantages of deep learning and machine learning and offers a fast and accurate classification to distinguish between genuine and forged signatures.
The experimental results show that the hybrid approach achieves high accuracy, precision, recall, and F1-score in various classifier configurations. The results indicate that it effectively captures features of genuine signatures while accurately detecting forgeries. Also, the use of pre-trained CNNs reduces the need for extensive training specific to the given dataset. This improves the efficiency and adaptability of the model to other signature verification tasks. Machine learning classifiers offer fast classification, making the system suitable for real-time applications.
In conclusion, the hybrid approach improves automated signature forgery detection. By integrating deep learning and traditional classifiers, the system provides a balanced solution that is effective in capturing detailed features and efficient enough to be implemented in practical applications.
Footnotes
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
