Abstract
Automatic Identification System (AIS) data plays a critical role in maritime analytics; however, AIS trajectories frequently suffer from noise, inconsistencies, and missing values that degrade their analytical reliability. To address these challenges, this study proposes the Deep Quality Vessel Trajectory Inspector (DQTI), a hybrid variational framework for AIS data quality enhancement and anomaly detection. The proposed model integrates a Variational Recurrent Neural Network (VRNN) with a learnable fusion of GRU and LSTM units, enabling effective modeling of both short-term motion continuity and long-range temporal dependencies in vessel trajectories. A four-hot encoding scheme is adopted to represent longitude, latitude, speed over ground, and course over ground, providing a structured and noise-tolerant representation of multivariate maritime signals. Anomaly detection is performed through a reconstruction-based mechanism that identifies inconsistent AIS messages by measuring deviations between observed and reconstructed trajectories. The experimental evaluation is conducted on more than 141,000 AIS records collected from 800 vessels in the Red Sea. Results demonstrate improved reconstruction accuracy, reduced KL divergence, and more reliable anomaly discrimination compared to baseline models. These findings highlight the effectiveness of data representations and hybrid variational recurrent architectures for enhancing AIS data quality in complex and noisy sequential datasets.
Keywords
Introduction
In recent years, advances in artificial intelligence (AI) and deep learning have been used to enhance industry operations and data analysis. These technologies enable organizations to solve complex problems, improve operational efficiency, and strengthen decision-making processes. One area that benefits significantly from these developments is the maritime industry, which relies on data for navigation, safety, and operational improvement. The Red Sea is one of the most strategically important sea routes in the world. The region plays a crucial role in global trade, with maritime transport accounting for 80% of the volume of global trade. 1 This strategic waterway connects different countries, providing opportunities for economic cooperation and regional stability. 2 It links Asia, Africa, and Europe, and thousands of vessels pass through it annually. This makes the region a vital link in the global supply chain, handling large volumes of international trade, including energy exports, raw materials, and consumer goods. The importance of this region has become more prominent in recent times due to operational disruptions and irregular maritime activities, which impact maritime security and trade. 3 Hence, ensuring the safety and efficiency of maritime operations in the Red Sea is extremely important.
The Automatic Identification System (AIS) has become an essential tool for monitoring and managing maritime activities. AIS provides real-time vessel information that contributes to collision avoidance and situational awareness. 4 However, AIS data often suffers from noise, errors, and missing information, which affect its quality and reliability.5,6 These issues can lead to inaccurate analyses and poor decisions, so comprehensive preprocessing is required before the data is used. 7 Researchers have developed various methods to address these problems, including density-based clustering, deep kernel convolution, and statistical approaches.5,6 Anomaly detection in AIS trajectories is essential for identifying safety and security-related events. 4 Despite these challenges, AIS data analytics have broad potential to support intelligent maritime surveillance systems, contributing to applications such as vessel tracking, trajectory pattern analysis, and event prediction. 8 Therefore, improving AIS data quality and developing advanced analytical techniques are essential to fully exploit this technology in maritime operations and research. For example, anomalies in ship trajectory data, such as sudden deviations or missing segments, can lead to inaccurate routing, increased fuel consumption, and, in the extreme case, maritime accidents. Furthermore, data quality and prediction accuracy not only impact operational efficiency and navigational safety but also contribute to the sustainability of marine ecosystems. Developing predictive maritime traffic maps enables decision-makers to identify areas of overlap between shipping lanes and sensitive marine habitats, and to propose alternative routes or speed limits that reduce the likelihood of collisions and underwater noise. 9 Moreover, integrating trajectory prediction models with biological distribution maps of marine organisms enables the design of more accurate policies to protect marine life in strategic seas such as the Red Sea. 10
However, despite the growing interest in AIS data analytics, several important research gaps remain unresolved, particularly in complex and geographically constrained regions such as the Red Sea. First, many anomaly detection methods still rely on density-based clustering techniques, such as DBSCAN, to extract waypoints and construct route graphs. These approaches are highly sensitive to hyperparameter selection and often fail to assign vessels to consistent route patterns—an issue that becomes more pronounced in the narrow maritime corridors of the Red Sea, where traffic density fluctuates sharply, and vessel behaviors are strongly multimodal. As a result, clustering-based representations frequently produce unstable or inaccurate trajectory structures, limiting their reliability for anomaly detection. Second, existing recurrent neural models such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), while effective in capturing temporal dependencies, exhibit notable weaknesses when used independently for AIS trajectory modeling. They struggle with noisy or missing data, suffer from gradient instability, and tend to produce geospatially homogeneous representations that overlook diverse behavioral patterns. These limitations reduce their robustness in environments characterized by irregular sampling, high noise levels, and complex vessel dynamics. Third, traditional real-valued encoding of AIS messages does not sufficiently preserve spatial–dynamic relationships between longitude, latitude, speed, and course. This reduces the model's ability to distinguish fine-grained behavioral deviations, particularly in constrained waterways such as the Red Sea, where small changes in movement can have significant operational implications. 
These gaps collectively highlight the need for a more stable, noise-tolerant, and spatially coherent representation, along with a recurrent architecture capable of capturing both short- and long-term dependencies in AIS trajectories under uncertainty.
To address these challenges, this study proposes the Deep Quality Vessel Trajectory Inspector (DQTI), a deep learning model developed specifically to improve AIS data quality and detect anomalies in the Red Sea region. The model is based on a Variational Recurrent Neural Network (VRNN) and leverages the complementary strengths of both GRU and LSTM architectures through a novel hybrid mechanism in which the hidden state is computed as a learnable weighted combination of the two units. This design enables the model to capture both short- and long-term dependencies more effectively while enhancing robustness to noise and missing data. Additionally, the model adopts a four-hot encoding technique that preserves spatial and dynamic relationships by discretizing longitude, latitude, speed over ground (SOG), and course over ground (COG). This representation eliminates the instability associated with clustering-based feature extraction and provides a consistent and informative structure for anomaly detection.
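The learnable weighted combination of the two recurrent units can be illustrated with a minimal sketch. The text does not state how the fusion weight is parameterized, so the sigmoid-squashed scalar `alpha_logit` and the helper name `fuse_hidden` below are assumptions made for illustration only; in a trained model the weight would be updated by backpropagation together with the other parameters.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def fuse_hidden(h_gru, h_lstm, alpha_logit):
    """Blend two hidden-state vectors with one scalar weight (assumed form)."""
    if len(h_gru) != len(h_lstm):
        raise ValueError("hidden states must have the same size")
    alpha = sigmoid(alpha_logit)  # keeps the fusion weight inside (0, 1)
    return [alpha * g + (1.0 - alpha) * l for g, l in zip(h_gru, h_lstm)]


# With alpha_logit = 0 the weight is 0.5, i.e. an equal blend of both units.
h = fuse_hidden([1.0, 0.0], [0.0, 1.0], alpha_logit=0.0)
```

A large positive `alpha_logit` makes the fused state follow the GRU output, and a large negative value makes it follow the LSTM output.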
This study is based on a dataset collected from MarineTraffic, comprising more than 141,000 AIS records from 800 vessels over two months (July and November 2024), providing a rich database for analysis and evaluation. By detecting anomalies and addressing noise and incomplete data, the proposed model enhances the accuracy and reliability of AIS data, supporting safer and more efficient maritime operations. The findings also provide a foundation for future research aimed at improving AIS data quality across different maritime regions.
The main contributions of this study are as follows:
A hybrid variational recurrent architecture that integrates a Variational Recurrent Neural Network (VRNN) with a learnable fusion of GRU and LSTM units, enabling stable latent modeling of noisy and irregular AIS trajectories.
A structured four-hot encoding scheme for longitude, latitude, speed over ground, and course over ground, allowing effective representation of multivariate maritime signals under varying spatial and temporal resolutions.
A reconstruction-based anomaly analysis framework that improves anomaly discrimination and reduces false-positive detections through chunk-level supervision and latent regularization, validated against GRU, LSTM, and baseline models.
Adaptive threshold optimization for anomaly detection using quantile-based estimation from the training reconstruction error distribution, with empirical percentile selection via F1-score to reduce false positives and negatives.
The paper is structured as follows: Section 2 presents a review of related work. Section 3 details the dataset and methodology used in this study. Section 4 discusses the results and highlights the proposed model's performance. Finally, Section 5 provides the conclusion, summarizing the main findings of the proposed approach.
Related work
In recent years, researchers have focused on the quality of AIS data because of its important role in maritime safety, navigation, and anomaly detection. Ensuring the quality of AIS data is critical to maritime safety, as poor-quality data, including tampering or spoofing, can confuse vessel tracking and increase the risk of accidents. 11 Another study highlighted the importance of continuous monitoring of AIS data, especially with the rise of autonomous shipping, which requires highly accurate and reliable data. 12 In addition, AIS data manipulation observed in December 2019 near the island of Elba demonstrated the vulnerability of the system to cyberattacks, highlighting the need for data integrity in maritime operations. 11 An analysis of ship traffic off the coast of Portugal identified 1766 potential collisions over a 32-day period, highlighting the importance of quality AIS data in enhancing navigational safety. 13 These studies and real-life cases show that maintaining accurate AIS data is essential for reducing maritime incidents and enhancing operational efficiency. AIS has become a primary source of vessel movement data, enabling a wide range of applications in maritime research and contributing to the digitalization of the maritime sector. 14 Common issues in AIS data include noise, outliers, duplicates, and inconsistent or missing data.7,15,16 Detecting anomalies in AIS data has gained attention due to its potential to improve safety and security. 4 Numerous approaches have been proposed to address AIS data quality challenges, including visual analytics, 15 machine learning techniques, 17 and data quality control procedures. 18
Traditional methods, such as rule-based approaches, provide interpretability, as they rely on pre-defined rules that are easy to understand. However, these methods struggle to enumerate anomalies comprehensively, because fixed rules may not cover all possible scenarios. They also have difficulty handling relative terms such as “fast” or “slow,” whose interpretations may vary depending on the operational context or vessel type. These limitations reduce their effectiveness in operational settings and have motivated the adoption of deep learning, which has become a preferred approach due to its flexibility, its scalability to massive amounts of data, and its ability to automatically detect anomaly patterns without explicit rules, making it more suitable for modern maritime applications. 19 Among deep learning techniques, Recurrent Neural Networks (RNNs) and their variants, such as LSTM, GRU, and encoder-decoder architectures, have been widely applied to predict vessel trajectories and detect anomalies in AIS data.20–23 While these models capture temporal dependencies effectively, they often face challenges in representing uncertainty, which reduces their ability to handle noisy and incomplete AIS data.24,25
To overcome some of the limitations of RNNs, researchers have introduced Variational Recurrent Neural Networks (VRNNs), which combine probabilistic modeling with recurrent structures. This approach helps capture complex patterns over time and handle uncertainty in multivariate time series.26,27 VRNNs have been widely used in applications that involve sequential data, showing their ability to learn latent representations while reconstructing corrupted or missing data. 28 This makes them particularly useful for AIS data, where tracking gaps and inconsistencies are common. Several studies have applied VRNNs to detect anomalies in AIS data, taking advantage of their strength in modeling sequential relationships.29,30 Transformer-based models, such as Transformer–VAE, 31 have recently been applied to AIS trajectory analysis due to their ability to capture long-range dependencies through self-attention mechanisms. Unlike recurrent models, Transformers enable parallel processing and improved modeling of global contextual relationships across vessel trajectories. However, despite these advantages, Transformer-based approaches often rely on raw continuous feature representations, which may fail to preserve critical spatial–dynamic relationships between longitude, latitude, speed, and course. Moreover, they are sensitive to noise and irregular sampling, which are inherent characteristics of AIS data. These shortcomings can reduce their effectiveness in anomaly detection tasks, particularly in complex maritime environments, and they motivate the need for a framework that combines robust temporal modeling with structured feature representation, which is one of the main design principles of the proposed DQTI model.
In addition to deep generative models, feature encoding techniques play an essential role in enhancing data representation for machine learning tasks, such as anomaly detection or maritime route prediction. Feature encoding refers to the process of transforming raw data (such as longitude, latitude, speed, and course) into structured numerical representations that a model can understand and process effectively. Traditional encoding methods face significant challenges. They often fail to preserve spatial relationships (such as geographic proximity between points) or categorical relationships (such as vessel type or operational behavior), which can lead to the loss of important information and reduce the accuracy of models.
To overcome these challenges, recent studies have explored multi-hot encoding methods, including one-hot and four-hot encoding, to improve feature representations in trajectory modeling. The four-hot encoding captures dependencies across multiple feature dimensions, allowing models to extract more contextual information from AIS records.19,32 By encoding longitude, latitude, speed over ground, and course over ground into distinct yet related feature spaces, this method has been shown to enhance data clustering and improve anomaly detection in vessel movement analysis. In line with the above review of the development of methods used in anomaly detection in maritime data, it is useful to provide a direct comparison between some of the prominent works discussed in this context to clarify the characteristics, key features, and shortcomings of each. The following table (Table 1) summarizes a group of these studies and highlights how our proposed model could address some of these challenges.
Comparison between previous studies and the proposed model.
Despite the significant progress achieved in AIS anomaly detection, several critical limitations remain. First, clustering-based methods are highly sensitive to parameter selection and often fail in complex and geographically constrained maritime environments. Second, recurrent models such as LSTM and GRU struggle to effectively capture uncertainty under noisy and incomplete AIS conditions, which limits their robustness. Third, Transformer-based approaches, while effective in modeling long-range dependencies, rely heavily on raw feature representations and are sensitive to irregular AIS sampling. Finally, traditional feature encoding methods often fail to preserve the underlying spatial–dynamic relationships between vessel movement attributes. These limitations collectively highlight the need for a robust and noise-tolerant framework that combines probabilistic temporal modeling with structured spatial–dynamic representation for AIS data quality enhancement and anomaly detection.
To address the challenges of noisy and incomplete AIS data, this study proposes a deep learning-based anomaly detection model tailored for maritime trajectories. The proposed model combines GRU and LSTM architectures with four-hot encoding to enhance the accuracy and reliability of anomaly detection. This method improves the identification of both spatial and dynamic anomalies in AIS data, yielding more accurate and consistent results.
Data acquisition and preprocessing
This study is based on vessel data collected from MarineTraffic, a specialized commercial platform that provides AIS data. In the first phase, the target geographic area (the Red Sea) was defined using the platform's geolocation tools. This step involved identifying the geographic coordinates representing the boundaries of the Red Sea region to ensure that the data was limited to this area of intense maritime activity. Next, a subscription was made to the paid MarineTraffic service to access high-resolution AIS data with full spatial and temporal attributes. The temporal scope of the dataset was defined to include two non-consecutive months (July and November 2024) in order to capture different operational conditions while maintaining a manageable and methodologically consistent dataset for trajectory-level anomaly detection. The vessel type and file format were selected to ensure suitability for subsequent processing. The downloaded files were checked for completeness to ensure that no records were missing during the data acquisition stage. The data extracted from the platform is high-resolution dynamic-spatial data, including AIS message logs that continuously monitor the position and movement of vessels in the Red Sea. It covers 800 vessels, including cargo, tanker, and passenger vessels. The dataset consists of more than 141,000 tracking messages, each representing a real-time observation of a vessel's status at a specific time. Each message contains a set of fields, the most important of which are the observation timestamp, latitude, longitude, SOG, COG, and a unique vessel identifier, as described in Table 2.
Key fields used to represent vessel tracking data
After the data collection process was completed, the raw data was pre-processed to ensure its quality and usability. Invalid or missing values, such as NaN or text placeholders such as “masked”, are removed. Fields containing numeric values are converted to floating-point format to prepare them for further analysis. All ship coordinates are verified to be within the Red Sea region by restricting the geographic coordinates to longitudes between 32° and 44°, and latitudes between 12° and 33°. This ensures that all ships in the dataset are in the Red Sea region. To ensure that the chronological order of messages or observations is maintained, the dataset is sorted by timestamp, and the index is reset. Unique ship identifiers are used to represent individual tracks. Table 3 shows a sample of the vessel data used, which contains vessel types, vessel ID (MMSI), timestamp, geographical coordinates that fall within the Red Sea boundaries, as well as the SOG and COG.
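The preprocessing steps above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the record layout (dictionaries with hypothetical keys `mmsi`, `timestamp`, `lon`, `lat`, `sog`, `cog`) and the assumption that timestamps are ISO-8601 strings, which sort correctly as text, are ours. Only the placeholder handling and the longitude and latitude bounds come from the text.

```python
LON_RANGE = (32.0, 44.0)  # degrees, as stated in the text
LAT_RANGE = (12.0, 33.0)  # degrees, as stated in the text


def to_float(value):
    """Return a float, or None for NaN, placeholders such as 'masked', or bad input."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    return None if number != number else number  # NaN is the only value != itself


def clean_records(records):
    """Drop invalid rows, cast numeric fields, keep Red Sea points, sort by time."""
    cleaned = []
    for rec in records:
        fields = {k: to_float(rec.get(k)) for k in ("lon", "lat", "sog", "cog")}
        if any(v is None for v in fields.values()):
            continue
        if not LON_RANGE[0] <= fields["lon"] <= LON_RANGE[1]:
            continue
        if not LAT_RANGE[0] <= fields["lat"] <= LAT_RANGE[1]:
            continue
        cleaned.append({"mmsi": rec["mmsi"], "timestamp": rec["timestamp"], **fields})
    cleaned.sort(key=lambda r: r["timestamp"])
    return cleaned


# Hypothetical rows: B has a masked speed, C lies outside the bounding box.
raw = [
    {"mmsi": "A", "timestamp": "2024-07-02T00:00:00", "lon": "38.5", "lat": "20.1", "sog": "12.3", "cog": "150"},
    {"mmsi": "B", "timestamp": "2024-07-01T00:00:00", "lon": "39.0", "lat": "21.0", "sog": "masked", "cog": "90"},
    {"mmsi": "C", "timestamp": "2024-07-01T12:00:00", "lon": "50.0", "lat": "21.0", "sog": "8.0", "cog": "90"},
    {"mmsi": "D", "timestamp": "2024-07-01T06:00:00", "lon": "35.0", "lat": "27.0", "sog": "9.5", "cog": "310"},
]
rows = clean_records(raw)
```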
A sample of AIS data for vessel movements in the Red Sea area
After preprocessing, AIS messages are transformed using a four-hot encoding scheme, where longitude, latitude, speed over ground (SOG), and course over ground (COG) are discretized into predefined bins and encoded independently as binary vectors, as shown in Figure 1. The resulting vectors are then concatenated into a unified representation, providing a structured and noise-tolerant encoding of vessel trajectories. Unlike one-hot encoding, which represents a single categorical variable independently, the proposed four-hot encoding jointly represents four key AIS attributes—longitude, latitude, speed over ground (SOG), and course over ground (COG)—in a structured multi-dimensional form. This design allows the model to capture both spatial and dynamic dependencies simultaneously, rather than treating each feature in isolation. Compared with raw continuous inputs, this representation preserves spatial and dynamic relationships more effectively, enabling improved representation learning and anomaly detection performance.

Transformation of input to four-hot encoding representation.
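The four-hot transformation can be illustrated with a short sketch. The bin counts follow the 260-bin configuration reported in this paper (100 longitude, 100 latitude, 50 SOG, 10 COG) and the longitude and latitude ranges follow the preprocessing bounds. Equal-width binning, the 0 to 30 knot SOG range, and the clamping of out-of-range values are assumptions made for illustration, since the text does not specify them.

```python
BINS = {"lon": 100, "lat": 100, "sog": 50, "cog": 10}
RANGES = {
    "lon": (32.0, 44.0),
    "lat": (12.0, 33.0),
    "sog": (0.0, 30.0),   # assumed speed range in knots
    "cog": (0.0, 360.0),
}


def bin_index(value, low, high, n_bins):
    """Map a value to an equal-width bin index, clamping out-of-range values."""
    frac = (value - low) / (high - low)
    return min(n_bins - 1, max(0, int(frac * n_bins)))


def four_hot(lon, lat, sog, cog):
    """Concatenate four one-hot segments into one binary vector."""
    vector = []
    for name, value in (("lon", lon), ("lat", lat), ("sog", sog), ("cog", cog)):
        segment = [0] * BINS[name]
        segment[bin_index(value, *RANGES[name], BINS[name])] = 1
        vector.extend(segment)
    return vector


v = four_hot(38.0, 22.5, 12.0, 180.0)
active = [i for i, bit in enumerate(v) if bit]
```

Each message therefore activates exactly four positions, one per attribute, in a vector whose length equals the total number of bins.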
Following this representation, the encoded AIS trajectories are used as inputs to the proposed DQTI model for temporal modeling and anomaly detection. We evaluated two binning configurations to balance encoding resolution with computational efficiency. The standard 260-bin configuration consists of 100 bins for longitude, 100 bins for latitude, 50 bins for SOG, and 10 bins for COG. The 560-bin configuration provides 200 bins for longitude, 300 bins for latitude, 50 bins for SOG, and 10 bins for COG. While the higher-resolution 560-bin configuration captures finer details, the 260-bin configuration demonstrated superior performance, particularly due to the geographical characteristics of the Red Sea. Because the Red Sea is a relatively narrow waterway (approximately 355 kilometers wide at its widest point, narrowing to about 26–29 kilometers at the Bab-el-Mandeb Strait), the 260-bin configuration offers sufficient resolution for anomaly detection while maintaining computational efficiency. To further enhance the model's ability to capture temporal dynamics, the vessel trajectory is segmented into fixed-length overlapping sequences using a sliding window approach. Given a trajectory:
Segments of length T = 10 are generated with a stride of 1, resulting in multiple overlapping subsequences. In this context, the sequence length T defines the Chunk Size, which is set to 10 time steps, as shown in Figure 2. Each segment (or chunk) is then transformed using the four-hot encoding scheme, producing a tensor of shape (T, D), where D denotes the encoding dimension (D = 260).

Sliding window-based trajectory segmentation.
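The segmentation step can be sketched as below, using the stated chunk size T = 10 and stride 1. The function name `sliding_windows` is hypothetical, and the choice to return no chunks for trajectories shorter than T is an assumption, since the text does not say how short tracks are handled.

```python
def sliding_windows(trajectory, length=10, stride=1):
    """Return overlapping chunks of `length` consecutive points."""
    last_start = len(trajectory) - length
    return [trajectory[s:s + length] for s in range(0, last_start + 1, stride)]


# A 14-point track yields 14 - 10 + 1 = 5 overlapping chunks.
points = list(range(14))
chunks = sliding_windows(points)
```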
Figure 3 illustrates the architecture of the proposed model based on the VRNN. VRNNs are an effective approach in handling time series data of ship trajectories, making them suitable for capturing vessel movement behavior and detecting anomalies. This approach enhances traditional LSTM and GRU models by incorporating probabilistic components. These additions improve the ability of the model to represent complex distributions and predict anomalies more accurately.

VRNN-Based anomaly detection process flow.
In this model, the encoder learns the probabilistic distribution of the latent variable
Where
Here,
This distribution ensures that the model can generate new data and predict temporal patterns using only the hidden state, without relying on the input data directly. The model maintains a probabilistic approach to latent variable generation while leveraging recurrent layers to capture temporal dependencies. The decoder reconstructs the input data
Here,
This loss measures how well the model can reconstruct the input data given the latent variable and hidden state. It also encourages the model to learn meaningful representations in the latent space. In the DQTI model, the hidden state
Where
The formula for KL divergence between two Gaussian distributions
Where
Where
After calculating the Anomaly Score, the Anomaly Rate can be determined, showing the percentage of time points that contain anomalies in the trajectories. This is calculated by comparing the difference between the actual and reconstructed positions of the ship to a specific threshold, using the following equation:
Where T is the total number of time points. This formulation provides a quantitative measure of abnormal behavior in vessel trajectories by capturing deviations from learned normal patterns.
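As a minimal sketch of this computation, the example below takes the per-point anomaly score to be the Euclidean distance between the observed and reconstructed positions, and the anomaly rate to be the share of time points whose score exceeds the threshold. The function names and the toy coordinates are assumptions made for illustration.

```python
import math


def anomaly_scores(observed, reconstructed):
    """Euclidean distance between each observed and reconstructed point."""
    return [math.dist(o, r) for o, r in zip(observed, reconstructed)]


def anomaly_rate(scores, threshold):
    """Fraction of time points whose score exceeds the threshold."""
    if not scores:
        return 0.0
    return sum(s > threshold for s in scores) / len(scores)


observed = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
reconstructed = [(0.0, 0.0), (1.0, 1.0), (5.0, 6.0), (3.0, 3.0)]
scores = anomaly_scores(observed, reconstructed)
rate = anomaly_rate(scores, threshold=1.0)
```

Multiplying the returned fraction by 100 gives the rate as a percentage of time points.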
Given the anomaly score defined as the Euclidean distance in Eq. (10), anomaly detection is performed by comparing the reconstruction error against a predefined threshold. This threshold determines whether a given trajectory point is considered normal or anomalous. The anomaly threshold is estimated using a quantile-based approach derived from the distribution of reconstruction errors. Specifically, a percentile-based cutoff is selected to distinguish between normal and anomalous trajectory points. The optimal percentile is selected empirically using performance metrics such as the F1-score, ensuring a balance between false positives and false negatives. This results in a robust and adaptive anomaly detection mechanism that adjusts to the underlying data distribution.
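The threshold selection can be sketched as follows. Candidate percentiles are taken from the distribution of training reconstruction errors, and the one that maximizes the F1-score on labeled data is kept. The candidate grid, the use of a separate labeled validation set, and all function names are assumptions made for illustration; the text only states that the percentile is chosen empirically using metrics such as the F1-score.

```python
def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def f1_score(labels, flags):
    """F1 from binary ground-truth labels and predicted anomaly flags."""
    tp = sum(1 for y, f in zip(labels, flags) if y and f)
    fp = sum(1 for y, f in zip(labels, flags) if not y and f)
    fn = sum(1 for y, f in zip(labels, flags) if y and not f)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


def select_threshold(train_errors, val_errors, val_labels, candidates=(90, 95, 97.5, 99)):
    """Return (percentile, threshold, F1) for the best-scoring candidate."""
    best = None
    for q in candidates:
        thr = percentile(train_errors, q)
        score = f1_score(val_labels, [e > thr for e in val_errors])
        if best is None or score > best[2]:
            best = (q, thr, score)
    return best


# Toy data: training errors spread over [0, 1]; two labeled anomalies in validation.
train_errors = [i / 100 for i in range(101)]
val_errors = [0.2, 0.5, 0.93, 0.96, 2.0, 3.0]
val_labels = [0, 0, 0, 0, 1, 1]
best_q, best_thr, best_f1 = select_threshold(train_errors, val_errors, val_labels)
```

In this toy case, lower percentiles flag normal points with slightly elevated errors as anomalies, so the search settles on a higher cutoff.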
Baseline models for comparison
For comparative evaluation, two benchmark models were selected. The first is Transformer–VAE, 31 which integrates a Transformer architecture with a Variational Autoencoder and is trained on raw feature values. The second is GeoTrackNet, 19 a state-of-the-art VRNN-based model that utilizes four-hot encoding for maritime trajectory representation. To ensure fairness in comparison, consistent hyperparameters were maintained across all models, including a batch size of 32, a learning rate of 1e-3 using the AdamW optimizer, and 200 training epochs.
Experimental evaluation
To evaluate the proposed model's performance, an experimental dataset was constructed by injecting synthetic anomalies and noise into 30% of the total trajectories. These anomalies were generated by perturbing key AIS attributes, including spatial position, speed over ground (SOG), and course over ground (COG), to simulate realistic abnormal vessel behaviors such as route deviation, abrupt speed changes, and erratic heading variations. This process provides a controlled benchmark for assessing detection capability. All models were trained using the AdamW optimizer, with trajectories segmented into chunks of ten time steps. Quantitative evaluation was conducted using multiple metrics, including reconstruction loss, KL divergence, total loss, anomaly rate, precision, recall, and F1-score. These metrics collectively assess each model's ability to accurately detect anomalous trajectories while preserving reliable reconstruction of normal vessel behavior. Model performance was continuously monitored throughout the training process, and comparative analysis was performed across baseline models and encoding configurations to evaluate robustness and scalability.
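A minimal sketch of such an injection procedure is given below. Only the 30% share and the perturbed attributes (position, SOG, COG) come from the text. The perturbation magnitudes, the uniform noise model, the fixed random seed, and the record layout are assumptions made for illustration.

```python
import random


def inject_anomalies(trajectories, fraction=0.30, seed=0):
    """Perturb a random `fraction` of trajectories; return (data, labels)."""
    rng = random.Random(seed)
    n_anomalous = round(fraction * len(trajectories))
    chosen = set(rng.sample(range(len(trajectories)), n_anomalous))
    out, labels = [], []
    for i, traj in enumerate(trajectories):
        if i not in chosen:
            out.append([dict(p) for p in traj])  # copy untouched trajectories
            labels.append(0)
            continue
        perturbed = []
        for p in traj:
            perturbed.append({
                "lon": p["lon"] + rng.uniform(-0.5, 0.5),              # route deviation
                "lat": p["lat"] + rng.uniform(-0.5, 0.5),
                "sog": max(0.0, p["sog"] + rng.uniform(-10.0, 10.0)),  # abrupt speed change
                "cog": (p["cog"] + rng.uniform(-90.0, 90.0)) % 360.0,  # erratic heading
            })
        out.append(perturbed)
        labels.append(1)
    return out, labels


# Ten hypothetical three-point trajectories; three of them are perturbed.
base = [[{"lon": 38.0, "lat": 20.0, "sog": 12.0, "cog": 90.0} for _ in range(3)] for _ in range(10)]
noisy, labels = inject_anomalies(base)
```

The returned labels give the ground truth against which precision, recall, and F1-score can be computed.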
Experiment and results
This section presents the performance results for the proposed Deep Quality Vessel Trajectory Inspector (DQTI) model, which is designed to detect anomalies in vessel trajectories. We compare the proposed variants (DQTI-GRU, DQTI-LSTM, and DQTI-Hybrid) with two recent benchmark models: Transformer–VAE and GeoTrackNet. The evaluation focuses on trajectory reconstruction accuracy, latent representation quality, anomaly detection performance, and scalability.
Dataset and model parameters
This part describes the AIS trajectory dataset used for training, validation, and testing, all sourced from the Red Sea. As detailed in Section 3.1, the dataset comprises approximately 141,000 AIS messages from 800 vessels, collected during July and November 2024. The dataset underwent a series of preprocessing steps, including noise filtering, removal of stationary vessels (speed < 1 knot), trajectory segmentation, and, for relevant models, four-hot encoding. The four-hot encoding scheme, which discretizes longitude, latitude, speed over ground (SOG), and course over ground (COG), was applied at two distinct resolutions to evaluate its scalability: a standard resolution (260-bin) and a high resolution (560-bin). The Transformer–VAE model was trained on the raw (unencoded) values of these four features.
Table 4 summarizes the key characteristics of the preprocessed dataset. The data was partitioned chronologically to prevent data leakage, with 70% used for training, 15% for validation, and the remaining 15% for testing. All models were trained using a chunking strategy, where trajectories were divided into smaller sequences (chunks) of 10 time steps to effectively handle irregularities and inconsistencies. Consistent hyperparameters were used to ensure a fair comparison: a batch size of 32, an initial learning rate of 1e-3 with the AdamW optimizer, and a maximum of 200 training epochs.
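The chronological partition can be sketched as below. The 70/15/15 proportions come from the text. Splitting over the globally time-sorted records, rather than per vessel, and the `timestamp` key are assumptions made for illustration.

```python
def chronological_split(records, train_pct=70, val_pct=15):
    """Split time-sorted records into train, validation, and test partitions."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    n_train = len(ordered) * train_pct // 100
    n_val = len(ordered) * val_pct // 100
    train = ordered[:n_train]
    val = ordered[n_train:n_train + n_val]
    test = ordered[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Twenty hypothetical records supplied in reverse time order.
records = [{"timestamp": t} for t in range(19, -1, -1)]
train, val, test = chronological_split(records)
```

Because every training record precedes every validation and test record in time, no future observation can leak into training.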
Summary of the preprocessed AIS trajectory dataset from the Red Sea.

Training loss curves for all models over 200 epochs.
This subsection compares the learning efficiency and stability of all models during the training phase. Figure 4 illustrates the training loss curves over 200 epochs, providing insight into each model's convergence behavior (using 260-bin configuration for the four-hot based models). The Transformer–VAE model exhibits slower convergence compared to the DQTI variants, stabilizing after approximately 150 epochs with some noticeable fluctuations. GeoTrackNet shows a significantly higher loss and much slower convergence, struggling to learn effectively from the four-hot encoded data. Among the recurrent models, DQTI-GRU and DQTI-LSTM show steady improvement but with noticeable fluctuations. The proposed DQTI-Hybrid model clearly outperforms all others, achieving the fastest convergence and the lowest final training loss. Its curve descends sharply and stabilizes with minimal oscillation, indicating superior training efficiency and robustness.
Table 5 provides a quantitative summary of the training performance, confirming the visual observations from Figure 4. The DQTI-Hybrid model achieves the lowest final training loss (2.69), reconstruction loss (2.66), and KL divergence (0.03). The low KL divergence indicates a stable and well-regularized latent representation of normal vessel behavior. This represents a significant improvement over the baselines, outperforming Transformer–VAE, which achieves a loss of 4.80, and vastly exceeding GeoTrackNet, which exhibits an excessively high loss (134.22).
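To make the relationship between these quantities concrete, the sketch below evaluates the standard closed-form KL divergence between two diagonal Gaussians and checks that the reported DQTI-Hybrid values are consistent with a total loss equal to the sum of the reconstruction and KL terms (2.66 + 0.03 = 2.69). The unit weighting of the KL term is inferred from these reported numbers rather than stated in the text, and the function name is hypothetical.

```python
import math


def kl_diag_gaussians(mu_q, sigma_q, mu_p, sigma_p):
    """KL(q || p) for diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, sq, mp, sp in zip(mu_q, sigma_q, mu_p, sigma_p):
        total += math.log(sp / sq) + (sq ** 2 + (mq - mp) ** 2) / (2.0 * sp ** 2) - 0.5
    return total


# Identical distributions give zero divergence; a unit mean shift gives 0.5.
kl_same = kl_diag_gaussians([0.0], [1.0], [0.0], [1.0])
kl_shift = kl_diag_gaussians([1.0], [1.0], [0.0], [1.0])

# Reported DQTI-Hybrid values: reconstruction loss 2.66, KL divergence 0.03.
total_loss = round(2.66 + 0.03, 2)
```

A KL term close to zero means the approximate posterior stays close to the prior, which matches the interpretation of a well-regularized latent space.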
Comprehensive training performance comparison for all models (using 260-bin configuration for four-hot models).

Visual comparison of reconstruction performance. The DQTI-Hybrid model demonstrates the highest fidelity in capturing fine-grained movement patterns compared to the smoother, less accurate baselines.
Having confirmed training efficiency, we proceed to evaluate the models’ ability to accurately reconstruct vessel trajectories. This capability is fundamental, as our model operates on the principle that a normal trajectory can be reconstructed with low error, while an anomalous trajectory will yield a high reconstruction error. Figure 5 presents a visual comparison between a ground truth trajectory segment (in blue) and its reconstructions from each model (in orange) for the four primary features.
The Transformer–VAE model reconstructs the trajectory in an overly smooth manner and fails to capture fine details and sharp transitions, particularly in the speed and course profiles. This suggests that a raw data representation may be insufficient for learning the complex dynamics of vessel motion. GeoTrackNet improves the result slightly but still struggles with abrupt changes. In contrast, all DQTI models, and especially DQTI-Hybrid, follow the original trajectory with high fidelity. The proposed model reconstructs changes in speed and course almost identically to the original, indicating that it has learned the natural motion patterns.
These visual observations are supported by the quantitative results in Table 6, where we computed the Mean Squared Error (MSE) for reconstruction. The DQTI-Hybrid achieves the lowest error (0.021), significantly outperforming Transformer–VAE (0.058) and GeoTrackNet (0.046). This confirms the hybrid model's capacity to learn and reproduce the complex spatiotemporal patterns in AIS data with high efficiency.
Quantitative trajectory reconstruction performance (260-bin configuration for relevant models).
Having established DQTI-Hybrid as the best-performing architecture, we analyze the impact of four-hot encoding resolution on its performance. Table 7 compares the model's performance using the standard (260-bin) and high-resolution (560-bin) encoding schemes. Although the high-resolution encoding offers greater granularity, it introduces significant challenges. The total loss of the DQTI-Hybrid model increases from 2.69 to 5.19, and the KL divergence rises from 0.03 to 0.09, indicating a less stable latent space. More importantly, the anomaly detection rate during testing increases from 0.5% to 1.1%. This suggests that the 560-bin encoding may capture fine-grained noise as salient features, leading to a slightly higher false positive rate without a corresponding gain in detecting genuine anomalies. The 260-bin configuration provides the best balance between accuracy, computational efficiency, and a practical anomaly detection rate for the geographical characteristics of the Red Sea.
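To make the notion of encoding resolution concrete, the following sketch builds a four-hot vector by concatenating one one-hot vector each for latitude, longitude, speed over ground, and course over ground. The per-feature bin counts and value ranges shown are hypothetical, chosen only so that the widths sum to 260; the actual allocation used in the study is not stated in this section.

```python
import numpy as np

FEATURES = ("lat", "lon", "sog", "cog")

def four_hot_encode(lat, lon, sog, cog, bins, ranges):
    """Concatenate four one-hot vectors into a single four-hot vector."""
    values = {"lat": lat, "lon": lon, "sog": sog, "cog": cog}
    parts = []
    for name in FEATURES:
        lo, hi = ranges[name]
        n = bins[name]
        # Map the value to a bin index; out-of-range values go to the edge bins.
        idx = int((values[name] - lo) / (hi - lo) * n)
        idx = min(max(idx, 0), n - 1)
        one_hot = np.zeros(n, dtype=np.float32)
        one_hot[idx] = 1.0
        parts.append(one_hot)
    return np.concatenate(parts)

# Hypothetical bin split (sums to 260) and hypothetical bounding ranges.
BINS = {"lat": 90, "lon": 90, "sog": 30, "cog": 50}
RANGES = {
    "lat": (12.0, 30.0),   # degrees north
    "lon": (32.0, 44.0),   # degrees east
    "sog": (0.0, 30.0),    # knots
    "cog": (0.0, 360.0),   # degrees
}

vec = four_hot_encode(20.5, 38.2, 12.0, 185.0, BINS, RANGES)
# vec has length 260 and exactly four non-zero entries.
```

Raising the bin counts narrows each bin, so small measurement jitter is more likely to move a value into a neighboring bin. That is one plausible reading of why the finer encoding behaves less stably, though the source does not confirm this mechanism.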
Performance comparison of DQTI-Hybrid with different encoding resolutions.
Figure 6 plots the reconstruction loss per epoch for both encoding resolutions. The 560-bin configuration shows higher loss throughout training with more pronounced fluctuations. The standard 260-bin configuration provides a smoother and more stable training curve, confirming its suitability for this application.

Reconstruction loss for both encoding resolutions.
A critical step in anomaly detection is selecting an optimal threshold to distinguish between normal and anomalous behavior. We analyze the distribution of reconstruction errors on the training and test sets using the DQTI-Hybrid model with the standard 260-bin configuration. Figure 7 shows the distribution of these errors. The training set errors (representing normal behavior) are tightly clustered around low values, while the test set exhibits a long tail, indicating the presence of anomalous trajectories with high reconstruction errors. This separation motivates the use of a percentile-based threshold on the training set errors.

Distribution of reconstruction errors for training and test sets. The training set (blue) represents normal behavior and is tightly clustered. The test set (orange) shows a long tail, suggesting the presence of anomalous samples with high reconstruction error.
We evaluate three candidate thresholds corresponding to the 90th, 95th, and 98th percentiles of the training error distribution. Table 8 presents the precision, recall, and F1-score for anomaly detection at each threshold. The 98th percentile yields high precision but very low recall, missing many true anomalies. The 90th percentile captures more anomalies but at the cost of significantly lower precision (more false positives). The 95th percentile provides the best trade-off, achieving the highest F1-score (0.92), and is therefore selected as the threshold for all subsequent experiments.
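The threshold selection procedure can be sketched as follows: compute each candidate percentile on the training reconstruction errors, flag test samples whose error exceeds it, and score the flags against labels. The function names and the toy error values are illustrative assumptions, not the authors' code or data.

```python
import numpy as np

def percentile_threshold(train_errors, q):
    """Return the q-th percentile of the training reconstruction errors."""
    return float(np.percentile(train_errors, q))

def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary anomaly labels."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))
    fp = int(np.sum(~y_true & y_pred))
    fn = int(np.sum(y_true & ~y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def evaluate_thresholds(train_errors, test_errors, test_labels,
                        percentiles=(90, 95, 98)):
    """Score each candidate percentile threshold on the labelled test set."""
    results = {}
    for q in percentiles:
        thr = percentile_threshold(train_errors, q)
        flagged = np.asarray(test_errors) > thr  # high error => anomaly
        results[q] = (thr, *precision_recall_f1(test_labels, flagged))
    return results

# Toy data only: evenly spaced training errors and four test samples.
train_errors = np.arange(101) / 100.0
test_errors = [0.10, 0.93, 0.97, 0.99]
test_labels = [0, 0, 1, 1]
scores = evaluate_thresholds(train_errors, test_errors, test_labels)
best_q = max(scores, key=lambda q: scores[q][3])  # percentile with highest F1
```

On this toy input the lowest threshold admits a false positive and the highest one misses a true anomaly, which mirrors the precision and recall trade-off described in the text.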
Anomaly detection performance for DQTI-Hybrid at different thresholds (260-bin configuration).
Using the optimal 95th percentile threshold, we compare the anomaly detection performance of all models on the test set. Figure 8 shows the anomaly detection error rate over training epochs, demonstrating that DQTI-Hybrid quickly establishes a low and stable error rate compared to other models.

Anomaly detection error rate under the 260-bin configuration.
Table 9 provides a comprehensive comparison using the standard 260-bin configuration for models that utilize it. The DQTI-Hybrid model achieves the highest scores across all metrics, with a Precision of 0.94, Recall of 0.91, and F1-Score of 0.92. This represents a substantial improvement over the baselines, particularly in recall, indicating its superior ability to correctly identify true anomalous trajectories. The proposed model clearly outperforms Transformer–VAE, which achieves an F1-score of 0.68, confirming the effectiveness of the four-hot representation and the proposed hybrid architecture.
Cross-model anomaly detection comparison (260-bin configuration for relevant models, 95th percentile threshold).
Figure 9 presents the confusion matrices for all models. DQTI-Hybrid correctly classifies the vast majority of both normal and anomalous trajectories, whereas Transformer–VAE fails to identify a considerable number of anomalies (high false negatives) and GeoTrackNet produces an excessive number of false positives.

Confusion matrices for all models.
A qualitative analysis was performed on the 34 anomalous trajectories successfully detected by the DQTI-Hybrid model (from the test set using the 260-bin configuration). These were categorized into three primary behavioral patterns, as summarized in Table 10. The model proved highly effective in detecting various forms of genuine irregular patterns. Figure 10 provides a visual example for each anomaly type, illustrating the model's ability to pinpoint the exact location and nature of the deviant behavior.

Visual examples of detected anomalies: (a) spatial deviation, (b) atypical speed, (c) course disruption. The model successfully isolates segments (highlighted in red) that deviate from learned normal patterns.
Categorization of anomalies detected by the DQTI-Hybrid model.
Figures 11 and 12 compare the DQTI-Hybrid representations of a normal and an anomalous trajectory. The normal trajectory (Figure 11) exhibits high agreement between the original and the reconstruction, with a stable and coherent latent representation. In contrast, the anomalous trajectory (Figure 12) shows clear deviations and a high reconstruction error, with points scattered in the latent space outside the normal range. These results confirm the model's ability to distinguish between normal and abnormal behavior based on the properties of the reconstruction and the latent representation.

DQTI-Hybrid representation of a normal trajectory.

DQTI-Hybrid representation of an anomalous trajectory.
The experimental results demonstrate that the proposed DQTI-Hybrid model provides a more efficient, stable, and reliable solution for reconstruction-based anomaly detection and AIS data quality assessment than all baseline models, including the Transformer–VAE which relies on raw inputs and the VRNN-based GeoTrackNet. Its superior performance is attributed to several complementary design choices: the hybrid recurrent architecture enhances temporal modeling across varying time scales, explicit reconstruction supervision strengthens the reliability of the learned normal-behavior baseline, the chunking strategy improves training stability on irregular AIS streams, and the four-hot encoding proves superior to raw feature representation in capturing spatial and dynamic relationships. The 260-bin four-hot encoding scheme, combined with the DQTI-Hybrid architecture and the 95th percentile anomaly threshold, offers the best balance between reconstruction accuracy, latent stability, and a practical, high-precision anomaly detection rate for the complex maritime environment of the Red Sea.
Conclusion
This study proposed DQTI, a hybrid VRNN-based framework for AIS data quality enhancement and anomaly detection in vessel trajectories. By combining a learnable GRU–LSTM fusion mechanism with four-hot encoding and reconstruction-based anomaly analysis, the proposed model achieved superior performance compared with benchmark models, including Transformer–VAE and GeoTrackNet. Experimental results on more than 141,000 AIS records from 800 vessels in the Red Sea showed that DQTI-Hybrid achieved the lowest total loss (2.69), the lowest KL divergence (0.03), and the highest anomaly detection performance with a precision of 0.94, recall of 0.91, and F1-score of 0.92. These findings confirm that the proposed framework provides a stable latent representation and more reliable separation between normal and anomalous vessel behavior. In addition, the 260-bin four-hot configuration provided the best balance between spatial resolution, latent stability, and practical anomaly detection performance, making it more suitable than the higher-resolution 560-bin configuration for the geographical characteristics of the Red Sea. The results also demonstrate that the reduction in anomaly rate reflects improved discrimination capability rather than suppression of abnormal behavior, which strengthens the practical relevance of the model for maritime monitoring applications. Although the present study focused on the Red Sea, the proposed framework can be adapted to other maritime regions. Future work will investigate cross-region validation, real-time anomaly monitoring, and the integration of contextual variables such as weather, sea state, and traffic density to further improve robustness under diverse navigational conditions.
Footnotes
Acknowledgements
None.
Ethical considerations
Not applicable. This study does not involve human participants or animals and relies solely on publicly available AIS vessel trajectory data.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Author contributions
The authors confirm contribution to the paper as follows: Conceptualization, Jawaher Alqahtani, Ayman Yafoz, and Mohamed El-Eliemy; methodology, Jawaher Alqahtani and Mohamed El-Eliemy; software, Jawaher Alqahtani; validation, Jawaher Alqahtani, Ayman Yafoz, and Mohamed El-Eliemy; formal analysis, Jawaher Alqahtani; investigation, Jawaher Alqahtani, Ayman Yafoz, and Mohamed El-Eliemy; resources, Jawaher Alqahtani; data curation, Jawaher Alqahtani; writing—original draft preparation, Jawaher Alqahtani; writing—review and editing, Jawaher Alqahtani, Ayman Yafoz, and Mohamed Hamdy El-Eliemy; visualization, Jawaher Alqahtani; supervision, Ayman Yafoz and Mohamed El-Eliemy; project administration, Ayman Yafoz and Mohamed El-Eliemy. All authors have read and approved the final version of the manuscript.
Funding
The project was funded by KAU Endowment (WAQF) at King Abdulaziz University, Jeddah, Saudi Arabia. The authors, therefore, acknowledge with thanks WAQF and the Deanship of Scientific Research (DSR) for technical and financial support.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
The data used in this study are available from the corresponding author upon reasonable request.
Availability of data and materials
The data that support the findings of this study are available from the Corresponding Author, J.A., upon reasonable request.
