A hybrid variational framework for AIS data quality enhancement and anomaly detection

Abstract

Automatic Identification System (AIS) data plays a critical role in maritime analytics; however, AIS trajectories frequently suffer from noise, inconsistencies, and missing values that degrade their analytical reliability. To address these challenges, this study proposes the Deep Quality Vessel Trajectory Inspector (DQTI), a hybrid variational framework for AIS data quality enhancement and anomaly detection. The proposed model integrates a Variational Recurrent Neural Network (VRNN) with a learnable fusion of GRU and LSTM units, enabling effective modeling of both short-term motion continuity and long-range temporal dependencies in vessel trajectories. A four-hot encoding scheme is adopted to represent longitude, latitude, speed over ground, and course over ground, providing a structured and noise-tolerant representation of multivariate maritime signals. Anomaly detection is performed through a reconstruction-based mechanism that identifies inconsistent AIS messages by measuring deviations between observed and reconstructed trajectories. The experimental evaluation is conducted on more than 141,000 AIS records collected from 800 vessels in the Red Sea. Results demonstrate improved reconstruction accuracy, reduced KL divergence, and more reliable anomaly discrimination compared to baseline models. These findings highlight the effectiveness of data representations and hybrid variational recurrent architectures for enhancing AIS data quality in complex and noisy sequential datasets.

Keywords

AIS data quality; trajectory anomaly detection; deep learning; variational models; probabilistic modeling; Red Sea

1 Introduction

In recent years, advances in artificial intelligence (AI) and deep learning have been used to enhance industry operations and data analysis. These technologies enable organizations to solve complex problems, improve operational efficiency, and strengthen decision-making processes. One area that benefits significantly from these developments is the maritime industry, which relies on data for navigation, safety, and operational improvement. The Red Sea is one of the most strategically important sea routes in the world and plays a crucial role in global trade, with maritime transport accounting for 80% of the volume of global trade.¹ This strategic waterway connects different countries, providing opportunities for economic cooperation and regional stability.² It links Asia, Africa, and Europe, and thousands of vessels pass through it annually. This makes the region a vital link in the global supply chain, handling large volumes of international trade, including energy exports, raw materials, and consumer goods. The importance of this region has become more prominent in recent times due to operational disruptions and irregular maritime activities, which impact maritime security and trade.³ Hence, ensuring the safety and efficiency of maritime operations in the Red Sea is extremely important.

The Automatic Identification System (AIS) has become an essential tool for monitoring and managing maritime activities. AIS provides real-time vessel information that contributes to collision avoidance and situational awareness.⁴ However, AIS data often suffers from noise, errors, and missing information, which affect its quality and reliability.^5,6 These issues can lead to inaccurate analyses and degrade decision-making, so comprehensive preprocessing is required before the data is used.⁷ Researchers have developed various methods to address these problems, including density-based clustering, deep kernel convolution, and statistical approaches.^5,6 Anomaly detection in AIS trajectories is essential for identifying safety and security-related events.⁴ Despite these challenges, AIS data analytics has broad potential to support intelligent maritime surveillance systems, contributing to applications such as vessel tracking, trajectory pattern analysis, and event prediction.⁸ Therefore, improving AIS data quality and developing advanced analytical techniques are essential to fully exploit this technology in maritime operations and research. For example, anomalies in ship trajectory data, such as sudden deviations or missing segments, can lead to inaccurate routing, increased fuel consumption, and, in extreme cases, maritime accidents. Furthermore, data quality and prediction accuracy not only impact operational efficiency and navigational safety but also contribute to the sustainability of marine ecosystems. 
Developing predictive maritime traffic maps enables decision-makers to identify areas of overlap between shipping lanes and sensitive marine habitats, and propose alternative routes or speed limits that reduce the likelihood of collisions and underwater noise.⁹ Moreover, integrating trajectory prediction models with biological distribution maps of marine organisms enables the design of more accurate policies to protect marine life in strategic seas such as the Red Sea.¹⁰

However, despite the growing interest in AIS data analytics, several important research gaps remain unresolved, particularly in complex and geographically constrained regions such as the Red Sea. First, many anomaly detection methods still rely on density-based clustering techniques, such as DBSCAN, to extract waypoints and construct route graphs. These approaches are highly sensitive to hyperparameter selection and often fail to assign vessels to consistent route patterns—an issue that becomes more pronounced in the narrow maritime corridors of the Red Sea, where traffic density fluctuates sharply, and vessel behaviors are strongly multimodal. As a result, clustering-based representations frequently produce unstable or inaccurate trajectory structures, limiting their reliability for anomaly detection. Second, existing recurrent neural models such as Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), while effective in capturing temporal dependencies, exhibit notable weaknesses when used independently for AIS trajectory modeling. They struggle with noisy or missing data, suffer from gradient instability, and tend to produce geospatially homogeneous representations that overlook diverse behavioral patterns. These limitations reduce their robustness in environments characterized by irregular sampling, high noise levels, and complex vessel dynamics. Third, traditional real-valued encoding of AIS messages does not sufficiently preserve spatial–dynamic relationships between longitude, latitude, speed, and course. This reduces the model's ability to distinguish fine-grained behavioral deviations, particularly in constrained waterways such as the Red Sea, where small changes in movement can have significant operational implications. 
These gaps collectively highlight the need for a more stable, noise-tolerant, and spatially coherent representation, along with a recurrent architecture capable of capturing both short- and long-term dependencies in AIS trajectories under uncertainty.

To address these challenges, this study proposes the Deep Quality Vessel Trajectory Inspector (DQTI), a deep learning model developed specifically to improve AIS data quality and detect anomalies in the Red Sea region. The model is based on a Variational Recurrent Neural Network (VRNN) and leverages the complementary strengths of both GRU and LSTM architectures through a novel hybrid mechanism in which the hidden state is computed as a learnable weighted combination of the two units. This design enables the model to capture both short- and long-term dependencies more effectively while enhancing robustness to noise and missing data. Additionally, the model adopts a four-hot encoding technique that preserves spatial and dynamic relationships by discretizing longitude, latitude, speed over ground (SOG), and course over ground (COG). This representation eliminates the instability associated with clustering-based feature extraction and provides a consistent and informative structure for anomaly detection.

This study is based on a dataset collected from MarineTraffic, comprising more than 141,000 AIS records from 800 vessels over two months (July and November 2024), providing a rich basis for analysis and evaluation. By detecting anomalies and addressing noise and incomplete data, the proposed model enhances the accuracy and reliability of AIS data, supporting safer and more efficient maritime operations. The findings also provide a foundation for future research aimed at improving AIS data quality across different maritime regions. The main contributions of this study are as follows:

A hybrid variational recurrent architecture that integrates a Variational Recurrent Neural Network (VRNN) with a learnable fusion of GRU and LSTM units, enabling stable latent modeling of noisy and irregular AIS trajectories.

A structured four-hot encoding scheme for longitude, latitude, speed over ground, and course over ground, allowing effective representation of multivariate maritime signals under varying spatial and temporal resolutions.

A reconstruction-based anomaly analysis framework that improves anomaly discrimination and reduces false-positive detections through chunk-level supervision and latent regularization, validated against GRU, LSTM, and baseline models.

Adaptive threshold optimization for anomaly detection using quantile-based estimation from training reconstruction error distribution, with empirical percentile selection via F1-score to reduce false positives and negatives.

The paper is structured as follows: Section 2 presents a review of related work. Section 3 details the dataset and methodology used in this study. Section 4 discusses the results and highlights the proposed model's performance. Finally, Section 5 provides the conclusion, summarizing the main findings of the proposed approach.

2 Related work

In recent years, researchers have focused on the quality of AIS data because of its important role in maritime safety, navigation, and anomaly detection. Ensuring the quality of AIS data is critical to maritime safety, as poor-quality data, including data affected by tampering or spoofing, can confuse vessel tracking and increase the risk of accidents.¹¹ Another study highlighted the importance of continuous monitoring of AIS data, especially with the rise of autonomous shipping, which requires highly accurate and reliable data.¹² In addition, AIS data manipulation observed in December 2019 near the island of Elba demonstrated the vulnerability of the system to cyberattacks, highlighting the need for data integrity in maritime operations.¹¹ An analysis of ship traffic off the coast of Portugal identified 1766 potential collisions over a 32-day period, highlighting the importance of high-quality AIS data in enhancing navigational safety.¹³ These studies and real-life cases show that maintaining accurate AIS data is essential for reducing maritime incidents and enhancing operational efficiency. AIS has become a primary source of vessel movement data, enabling a wide range of applications in maritime research and contributing to the digitalization of the maritime sector.¹⁴ Common issues in AIS data include noise, outliers, duplicates, and inconsistent or missing data.^7,15,16 Detecting anomalies in AIS data has gained attention due to its potential to improve safety and security.⁴ Numerous approaches have been proposed to address AIS data quality challenges, including visual analytics,¹⁵ machine learning techniques,¹⁷ and data quality control procedures.¹⁸

Traditional methods, such as rule-based approaches, provide interpretability, as they rely on pre-defined rules that are easy to understand. However, because the rules are fixed, they cannot cover all possible scenarios and therefore struggle to define a comprehensive list of anomalies. They also have difficulty handling relative terms such as “fast” or “slow,” whose interpretations may vary depending on the operational context or vessel type. These limitations reduce their effectiveness in operational settings and motivated the adoption of deep learning, which has become a preferred approach due to its flexibility, its scalability to massive amounts of data, and its ability to detect anomaly patterns automatically without explicit rules, making it more suitable for modern maritime applications.¹⁹ Among deep learning techniques, Recurrent Neural Networks (RNNs) and their variants, such as LSTM, GRU, and encoder-decoder architectures, have been widely applied to predict vessel trajectories and detect anomalies in AIS data.^20–23 While these models capture temporal dependencies effectively, they often face challenges in representing uncertainty, which reduces their ability to handle noisy and incomplete AIS data.^24,25

To overcome some of the limitations of RNNs, researchers have introduced Variational Recurrent Neural Networks (VRNNs), which combine probabilistic modeling with recurrent structures. This approach helps capture complex patterns over time and handle uncertainty in multivariate time series.^26,27 VRNNs have been widely used in applications that involve sequential data, showing their ability to learn latent representations while reconstructing corrupted or missing data.²⁸ This makes them particularly useful for AIS data, where tracking gaps and inconsistencies are common. Several studies have applied VRNNs to detect anomalies in AIS data, taking advantage of their strength in modeling sequential relationships.^29,30 Transformer-based models, such as Transformer–VAE,³¹ have recently been applied to AIS trajectory analysis due to their ability to capture long-range dependencies through self-attention mechanisms. Unlike recurrent models, Transformers enable parallel processing and improved modeling of global contextual relationships across vessel trajectories. However, despite these advantages, Transformer-based approaches often rely on raw continuous feature representations, which may fail to preserve critical spatial–dynamic relationships between longitude, latitude, speed, and course. Moreover, they are sensitive to noise and irregular sampling, which are inherent characteristics of AIS data. These limitations can reduce their effectiveness in anomaly detection tasks, particularly in complex maritime environments. They motivate the need for a framework that combines robust temporal modeling with structured feature representation, which is one of the main design principles of the proposed DQTI model.

In addition to deep generative models, feature encoding techniques play an essential role in enhancing data representation for machine learning tasks, such as anomaly detection or maritime route prediction. Feature encoding refers to the process of transforming raw data (such as longitude, latitude, speed, and course) into structured numerical representations that a model can understand and process effectively. Traditional encoding methods face significant challenges. They often fail to preserve spatial relationships (such as geographic proximity between points) or categorical relationships (such as vessel type or operational behavior), which can lead to the loss of important information and reduce the accuracy of models.

To overcome these challenges, recent studies have explored multi-hot encoding methods, including one-hot and four-hot encoding, to improve feature representations in trajectory modeling. The four-hot encoding captures dependencies across multiple feature dimensions, allowing models to extract more contextual information from AIS records.^19,32 By encoding longitude, latitude, speed over ground, and course over ground into distinct yet related feature spaces, this method has been shown to enhance data clustering and improve anomaly detection in vessel movement analysis. In line with the above review of the development of methods used in anomaly detection in maritime data, it is useful to provide a direct comparison between some of the prominent works discussed in this context to clarify the characteristics, key features, and shortcomings of each. The following table (Table 1) summarizes a group of these studies and highlights how our proposed model could address some of these challenges.

Table 1.
Comparison between previous studies and the proposed model.

Study	Methodology used	Strengths	Weaknesses	Addressing shortcomings
Kazemi et al., 2013³³	Explicit Rules Developed by Experts to define abnormal behavior.	Easy to interpret, no need for huge data	Not covering all scenarios and difficult to deal with relative terms	Replace them with unsupervised learning to automatically detect anomaly patterns without relying on predefined rules.
Capobianco et al., 2021²⁰	Memory-based sequential models for representing temporal relationships.	Good representation of temporal sequences, strong results in predicting trajectories	Poor representation of uncertainty, affected by data errors and missing values	Introduce generative models (such as VRNN) to represent the probability distribution of data.
Zhao and Shi, 2019²³	DBSCAN + RNN for trajectory Prediction	Improves parameter selection using statistics	Weak on short paths, sensitive to sample quality.	Supports incomplete or short paths through a sequential probabilistic representation.
Clemmensen Kristoffer et al., 2021²⁹	Combining recurrent networks with probabilistic generative models.	Dealing with missing or noisy data, learning latent representations	Limited in modeling complex dynamic patterns.	Used to address tracking gaps and inconsistencies in AIS data with high accuracy.
Nguyen et al., 2021¹⁹	Combining VRNN with four-hot representation.	Latent geographic representation, generative model	Model complexity, training challenges.	The model takes inspiration from the idea and improves on it by hybridizing GRU and LSTM.
Hou et al., 2025³¹	Transformer with variational latent modeling on raw AIS features.	Captures long-range dependencies and global contextual relationships	Sensitive to noisy and irregular AIS sampling; raw features may not preserve spatial–dynamic structure	Introduce a structured four-hot representation combined with a hybrid probabilistic recurrent framework (DQTI) to improve robustness against noise and better capture spatial–dynamic dependencies.

Despite the significant progress achieved in AIS anomaly detection, several critical limitations remain. First, clustering-based methods are highly sensitive to parameter selection and often fail in complex and geographically constrained maritime environments. Second, recurrent models such as LSTM and GRU struggle to effectively capture uncertainty under noisy and incomplete AIS conditions, which limits their robustness. Third, Transformer-based approaches, while effective in modeling long-range dependencies, rely heavily on raw feature representations and are sensitive to irregular AIS sampling. Finally, traditional feature encoding methods often fail to preserve the underlying spatial–dynamic relationships between vessel movement attributes. These limitations collectively highlight the need for a robust and noise-tolerant framework that combines probabilistic temporal modeling with structured spatial–dynamic representation for AIS data quality enhancement and anomaly detection.

3 Methodology

To address the challenges of noisy and incomplete AIS data, this study proposes a deep learning-based anomaly detection model tailored for maritime trajectories. The proposed model combines GRU and LSTM architectures with four-hot encoding to enhance the accuracy and reliability of anomaly detection. This combination improves the identification of both spatial and dynamic anomalies in AIS data.

3.1 Data acquisition and preprocessing

This study is based on vessel data collected from MarineTraffic, a specialized commercial platform that provides AIS data. In the first phase, the target geographic area, the Red Sea, was defined using the platform's geolocation tools. This step involved identifying the geographic coordinates representing the boundaries of the Red Sea region to ensure that the data was limited to this area of intense maritime activity. Next, a subscription to the paid MarineTraffic service was obtained to access high-resolution AIS data with full spatial and temporal attributes. The temporal scope of the dataset was defined to include two non-consecutive months (July and November 2024) in order to capture different operational conditions while maintaining a manageable and methodologically consistent dataset for trajectory-level anomaly detection. The vessel type and file format were selected to ensure suitability for subsequent processing, and the downloads were monitored to ensure that the files were complete and that no records were missing. The extracted data is high-resolution dynamic-spatial data, including AIS message logs that continuously monitor the position and movement of vessels in the Red Sea. It covers 800 vessels, including cargo vessels, tankers, and passenger vessels. The dataset consists of more than 141,000 tracking messages, each representing a real-time observation of a vessel's status at a specific time. Each message contains a set of fields, the most important of which are the timestamp of the observation, latitude, longitude, SOG, COG, and a unique vessel identifier, as described in Table 2.

Table 2.
Key fields used to represent vessel tracking data

Key fields	Description
Timestamp	Specifies the exact time at which each AIS message was transmitted.
Longitude and Latitude	Specify the vessel's instantaneous geographic position at the time the AIS message was sent.
Speed Over Ground (SOG)	Represents the vessel's instantaneous speed relative to the Earth's surface at the transmission time.
Course Over Ground (COG)	Indicates the vessel's instantaneous movement direction at the time of the AIS message.
Vessel ID (or MMSI)	Allows each vessel to be tracked independently by linking sequential AIS messages over the monitoring period.

After data collection, the raw data was preprocessed to ensure its quality and usability. Invalid or missing values, such as NaN or text placeholders such as “masked”, were removed. Fields containing numeric values were converted to floating-point format to prepare them for further analysis. All vessel coordinates were restricted to the Red Sea region by keeping only longitudes between 32° and 44° and latitudes between 12° and 33°. To preserve the chronological order of the messages, the dataset was sorted by timestamp and the index was reset. Unique vessel identifiers were used to represent individual tracks. Table 3 shows a sample of the vessel data used, which contains the vessel type, vessel ID (MMSI), timestamp, geographical coordinates within the Red Sea boundaries, and the SOG and COG.

Table 3.

A sample of AIS data for vessel movements in the Red Sea area

Vessel name	Type	MMSI	Timestamp	Longitude	Latitude	SOG	COG
JASMINE	Cargo	511100266	01/11/2024 08:53:00 AM	39.11364	19.90281	11	161
NAVIS	Tanker	574004790	22/11/2024 06:58:00 AM	34.17316	27.39137	13.7	151
AE URANUS	Cargo	636022155	22/11/2024 01:39:00 PM	34.68066	26.75934	11.1	148
PRESTO GEE	Passenger	352002169	20/11/2024 09:19:00 PM	35.48533	25.84643	12.1	326
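The preprocessing steps described in this section can be sketched roughly as follows. This is a minimal illustration using pandas, not the authors' implementation: the column names follow Table 3, the timestamp format is inferred from the sample rows, and the two rows with MMSI 999999998 and 999999999 are hypothetical records added only to exercise the filters.

```python
import pandas as pd

NUMERIC_COLS = ["Longitude", "Latitude", "SOG", "COG"]
LON_RANGE = (32.0, 44.0)  # Red Sea longitude bounds stated in the text
LAT_RANGE = (12.0, 33.0)  # Red Sea latitude bounds stated in the text
TIME_FORMAT = "%d/%m/%Y %I:%M:%S %p"  # assumed from the Table 3 samples


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Clean raw AIS rows: coerce types, drop invalid rows, clip to region, sort."""
    df = df.copy()
    # Text placeholders such as "masked" become NaN when coerced to numeric.
    for col in NUMERIC_COLS:
        df[col] = pd.to_numeric(df[col], errors="coerce")
    df["Timestamp"] = pd.to_datetime(df["Timestamp"], format=TIME_FORMAT, errors="coerce")
    df = df.dropna(subset=NUMERIC_COLS + ["Timestamp", "MMSI"])
    # Keep only positions inside the Red Sea bounding box.
    in_region = df["Longitude"].between(*LON_RANGE) & df["Latitude"].between(*LAT_RANGE)
    df = df[in_region]
    # Restore chronological order and reset the index.
    return df.sort_values("Timestamp").reset_index(drop=True)


raw = pd.DataFrame(
    {
        "MMSI": [574004790, 511100266, 999999999, 999999998],
        "Timestamp": [
            "22/11/2024 06:58:00 AM",
            "01/11/2024 08:53:00 AM",
            "05/11/2024 10:00:00 AM",  # hypothetical row with a masked value
            "06/11/2024 10:00:00 AM",  # hypothetical row outside the region
        ],
        "Longitude": ["34.17316", "39.11364", "masked", "50.0"],
        "Latitude": ["27.39137", "19.90281", "20.0", "20.0"],
        "SOG": ["13.7", "11", "9.0", "9.0"],
        "COG": ["151", "161", "90", "90"],
    }
)

clean = preprocess(raw)
# Each MMSI identifies one vessel track.
tracks = {mmsi: group for mmsi, group in clean.groupby("MMSI")}
```

Only the two rows taken from Table 3 survive in this example; the masked row is dropped during type coercion and the out-of-region row is dropped by the bounding-box filter.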

3.2 Multi-dimensional data representation

After preprocessing, AIS messages are transformed using a four-hot encoding scheme, where longitude, latitude, speed over ground (SOG), and course over ground (COG) are discretized into predefined bins and encoded independently as binary vectors, as shown in Figure 1. The resulting vectors are then concatenated into a unified representation, providing a structured and noise-tolerant encoding of vessel trajectories. Unlike one-hot encoding, which represents a single categorical variable independently, the proposed four-hot encoding jointly represents four key AIS attributes—longitude, latitude, speed over ground (SOG), and course over ground (COG)—in a structured multi-dimensional form. This design allows the model to capture both spatial and dynamic dependencies simultaneously, rather than treating each feature in isolation. Compared with raw continuous inputs, this representation preserves spatial and dynamic relationships more effectively, enabling improved representation learning and anomaly detection performance.

Figure 1.

Transformation of input to four-hot encoding representation.
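To make the encoding concrete, the following minimal NumPy sketch builds a four-hot vector for one AIS message. It assumes bin counts of 100, 100, 50, and 10 for longitude, latitude, SOG, and COG (260 dimensions in total) and the Red Sea bounding box from Section 3.1. The SOG upper bound of 30 knots, the uniform bin widths, and the function names are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

ORDER = ("lon", "lat", "sog", "cog")
BINS = {"lon": 100, "lat": 100, "sog": 50, "cog": 10}
RANGES = {
    "lon": (32.0, 44.0),   # Red Sea bounds from Section 3.1
    "lat": (12.0, 33.0),   # Red Sea bounds from Section 3.1
    "sog": (0.0, 30.0),    # assumed speed cap in knots (hypothetical)
    "cog": (0.0, 360.0),   # course in degrees
}


def bin_index(value: float, low: float, high: float, n_bins: int) -> int:
    """Map a value in [low, high] to an integer bin in [0, n_bins - 1]."""
    frac = (value - low) / (high - low)
    return int(np.clip(np.floor(frac * n_bins), 0, n_bins - 1))


def four_hot(lon: float, lat: float, sog: float, cog: float) -> np.ndarray:
    """Concatenate one one-hot vector per attribute into a single four-hot vector."""
    values = dict(zip(ORDER, (lon, lat, sog, cog)))
    parts = []
    for name in ORDER:
        one_hot = np.zeros(BINS[name], dtype=np.float32)
        low, high = RANGES[name]
        one_hot[bin_index(values[name], low, high, BINS[name])] = 1.0
        parts.append(one_hot)
    return np.concatenate(parts)


# First sample row of Table 3: lon 39.11364, lat 19.90281, SOG 11, COG 161.
vec = four_hot(39.11364, 19.90281, 11.0, 161.0)
active_bins = np.flatnonzero(vec)  # positions of the four active entries
```

Each message therefore activates exactly four entries, one per attribute, so two messages that differ only slightly in position or speed share most of their active bins.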

Following this representation, the encoded AIS trajectories are used as inputs to the proposed DQTI model for temporal modeling and anomaly detection. We evaluated two binning configurations to balance encoding resolution with computational efficiency. The standard 260-bin configuration consists of 100 bins for longitude, 100 bins for latitude, 50 bins for SOG, and 10 bins for COG. The 560-bin configuration provides 200 bins for longitude, 300 bins for latitude, 50 bins for SOG, and 10 bins for COG. Although the higher-resolution 560-bin configuration captures finer details, the 260-bin configuration demonstrated superior performance in our experiments, which we attribute to the geographical characteristics of the Red Sea. Because the Red Sea is a relatively narrow waterway (approximately 355 kilometers wide at its widest point and narrowing to about 26–29 kilometers at the Bab-el-Mandeb Strait), the 260-bin configuration offers sufficient spatial granularity for anomaly detection while maintaining computational efficiency. To further enhance the model's ability to capture temporal dynamics, each vessel trajectory is segmented into fixed-length overlapping sequences using a sliding window approach. Given a trajectory:

X = \{x_{1}, x_{2}, \dots, x_{N}\}

Segments of length T = 10 are generated with a stride of 1, resulting in multiple overlapping subsequences. In this context, the sequence length T defines the Chunk Size, which is set to 10 time steps, as shown in Figure 2. Each segment (or chunk) is then transformed using the four-hot encoding scheme, producing a tensor of shape (T, D), where D denotes the encoding dimension (D = 260).

Figure 2.

Sliding window-based trajectory segmentation.
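The segmentation step can be illustrated with a short NumPy sketch. It assumes an already encoded trajectory of shape (N, D) and uses the chunk size T = 10 and stride 1 given above. The function name and the placeholder input array are illustrative only.

```python
import numpy as np


def sliding_windows(encoded: np.ndarray, chunk_size: int = 10, stride: int = 1) -> np.ndarray:
    """Split an (N, D) trajectory into overlapping chunks of shape (chunk_size, D)."""
    n_steps, dim = encoded.shape
    if n_steps < chunk_size:
        # Trajectories shorter than one chunk yield no segments.
        return np.empty((0, chunk_size, dim), dtype=encoded.dtype)
    starts = range(0, n_steps - chunk_size + 1, stride)
    return np.stack([encoded[s:s + chunk_size] for s in starts])


# Placeholder trajectory with N = 25 messages and D = 260 encoding dimensions.
encoded = np.arange(25 * 260, dtype=np.float32).reshape(25, 260)
chunks = sliding_windows(encoded, chunk_size=10, stride=1)
```

With a stride of 1, a trajectory of N messages yields N - T + 1 chunks, so the 25-message placeholder above produces 16 chunks of shape (10, 260).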

3.3 DQTI model structure

Figure 3 illustrates the architecture of the proposed model based on the VRNN. VRNNs are an effective approach in handling time series data of ship trajectories, making them suitable for capturing vessel movement behavior and detecting anomalies. This approach enhances traditional LSTM and GRU models by incorporating probabilistic components. These additions improve the ability of the model to represent complex distributions and predict anomalies more accurately.

Figure 3.

VRNN-based anomaly detection process flow.

In this model, the encoder learns the probabilistic distribution of the latent variable $z_{t}$ from the observed data $x_{t}$. This distribution is determined by two values: the mean $μ_{t}$ and the variance $σ_{t}^{2}$. These two values define how the data is represented in the latent space. The probabilistic distribution is expressed as:

q (z_{t} | x_{t}) = N (z_{t} | μ_{t}, σ_{t}^{2})

(1)

Where $q (z_{t} | x_{t})$ represents the posterior distribution of the latent variable $z_{t}$, conditioned on the input data $x_{t}$. To keep the sampling step differentiable during training, the reparameterization trick is used. This technique allows gradients to flow through the latent space during backpropagation by expressing $z_{t}$ as:

z_{t} = μ_{t} + σ_{t} ⊙ ε_{t}, \quad \text{where } ε_{t} \sim N (0, 1)

(2)

Here, $ε_{t}$ is noise sampled from a standard normal distribution. The latent variable $z_{t}$ is computed by combining the mean $μ_{t}$, the standard deviation $σ_{t}$, and the noise. This method allows the latent variable $z_{t}$ to capture the underlying patterns of the input data while keeping the sampling step differentiable, so that the model can learn a structured probabilistic distribution. The prior distribution $p (z_{t} | h_{t - 1})$ is introduced to model the distribution of the latent variable without depending on the input data $x_{t}$. Instead, it depends on the hidden state $h_{t - 1}$, which is produced by the recurrent layers of the model (such as LSTM or GRU). The prior is set as a standard normal distribution:

p (z_{t}) = N (z_{t} | 0, 1)

(3)

This distribution ensures that the model can generate new data and predict temporal patterns using only the hidden state, without relying on the input data directly. The model maintains a probabilistic approach to latent variable generation while leveraging recurrent layers to capture temporal dependencies. The decoder reconstructs the input data $x_{t}$ from the latent variable $z_{t}$ and the hidden state $h_{t - 1}$. This process is represented as a probability distribution:

$$p (x_{t} \mid z_{t}, h_{t - 1}) = N ({\hat{x}}_{t}, \phi_{z} (z_{t})) \tag{4}$$

Here, ${\hat{x}}_{t}$ represents the reconstructed data, and $\phi_{z} (z_{t})$ refers to an intermediate feature representation obtained from the latent variable $z_{t}$ through a neural transformation, ReLU( $z_{t}$ + $b_{z}$ ). This is not the final output of the decoder: the full decoder typically maps $z_{t}$ to the parameters of a Gaussian distribution, namely the mean $μ_{t}$ and variance $σ_{t}^{2}$, which together define the conditional distribution $p (x_{t} | z_{t}, h_{t - 1})$ . The reconstruction loss is computed using binary cross-entropy, which is well suited to the four-hot encoded representation of the AIS data, to measure how closely the reconstructed data matches the original input. The reconstruction loss is given by:

$$L_{t}^{recon} = - \sum_{i = 1}^{N} \left( x_{t, i} \log ({\hat{x}}_{t, i}) + (1 - x_{t, i}) \log (1 - {\hat{x}}_{t, i}) \right) \tag{5}$$
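
As a hedged illustration of Eq. (5), the sketch below evaluates the binary cross-entropy for a single multi-hot target vector. The clamping constant `eps` is an added numerical safeguard rather than part of the equation, and the six-bin toy vectors are hypothetical stand-ins for a full four-hot vector.

```python
import math


def bce_reconstruction_loss(x, x_hat, eps=1e-7):
    """Binary cross-entropy summed over the N bins of one encoded vector (Eq. 5)."""
    total = 0.0
    for target, prob in zip(x, x_hat):
        p = min(max(prob, eps), 1.0 - eps)   # clamp to keep log() finite (assumption)
        total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return total


# Toy 6-bin target with two active bins; a real vector would have 260 bins, four active.
x_t = [0, 1, 0, 0, 1, 0]
good = [0.01, 0.97, 0.01, 0.02, 0.95, 0.01]   # confident, correct reconstruction
bad = [0.30, 0.40, 0.30, 0.30, 0.20, 0.30]    # diffuse reconstruction
loss_good = bce_reconstruction_loss(x_t, good)
loss_bad = bce_reconstruction_loss(x_t, bad)
```

A reconstruction that places high probability on the active bins and low probability elsewhere yields a smaller loss than a diffuse one, which is the behavior the training objective rewards.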

This loss measures how well the model can reconstruct the input data given the latent variable and hidden state, and it encourages the model to learn meaningful representations in the latent space.

In the DQTI model, the hidden state $h_{t}$ is updated by combining the outputs of two different recurrent units: GRU and LSTM. The hidden state is computed first by the GRU unit and then by the LSTM unit, and the two results are combined using a learnable parameter α, which controls the weight of each unit's influence on the final update. The equation for the hidden state update is as follows:

$$h_{t} = α \cdot h_{gru} + (1 - α) \cdot h_{lstm} \tag{6}$$
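
The fusion in Eq. (6) amounts to a convex blend of two vectors when α lies in [0, 1]. The sketch below shows only this blending step in plain Python: the function name `fuse_hidden_states` and the toy vectors are hypothetical, and the GRU and LSTM computations themselves are not reproduced.

```python
def fuse_hidden_states(h_gru, h_lstm, alpha):
    """Blend two hidden-state vectors as alpha * h_gru + (1 - alpha) * h_lstm (Eq. 6)."""
    if len(h_gru) != len(h_lstm):
        raise ValueError("hidden states must have the same size")
    return [alpha * g + (1.0 - alpha) * s for g, s in zip(h_gru, h_lstm)]


h_gru = [0.2, -0.4, 0.9]       # toy GRU output (assumed values)
h_lstm = [0.6, 0.0, -0.1]      # toy LSTM output (assumed values)
h_t = fuse_hidden_states(h_gru, h_lstm, alpha=0.5)   # 0.5 is the stated initial value
```

In a trainable implementation α would be a parameter updated by the optimizer. Whether it is constrained to [0, 1] (for example through a sigmoid) is not stated in the text, so no constraint is applied here.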

In Eq. (6), $h_{gru}$ is the hidden state produced by the GRU unit and $h_{lstm}$ is the hidden state generated by the LSTM unit. The parameter α is initialized to 0.5 and treated as a global learnable scalar shared across all time steps. It is optimized during training via backpropagation, allowing the model to balance the contributions of the two units: the GRU captures short-term dependencies, while the LSTM is effective in handling long-term dependencies.

The KL divergence measures the difference between the posterior distribution $q (z_{t} | x_{t})$ and the prior distribution $p (z_{t} | h_{t - 1})$, that is, how closely the latent distribution inferred from the observed AIS data aligns with the prior. In this VRNN-based setting, a lower KL value typically reflects a more stable and well-regularized latent representation of normal vessel behavior, which helps establish a reliable baseline for anomaly detection. Conversely, higher KL values indicate increased uncertainty or mismatch in the latent space, which can reduce the consistency of reconstruction and make anomaly scoring less reliable if the model fails to learn normal dynamics adequately. The KL term is a critical component of the variational autoencoder (VAE) framework, keeping the latent space well-structured and helping to prevent overfitting. It is expressed as:

$$L_{KL} = D_{KL} \left[ q (z_{t} \mid x_{t}) \,\|\, p (z_{t} \mid h_{t - 1}) \right] \tag{7}$$

The formula for KL divergence between two Gaussian distributions $N (μ_{t}, σ_{t}^{2})$ and $N (μ_{p, t}, σ_{p, t}^{2})$ is:

$$D_{KL} \left( N (μ_{t}, σ_{t}^{2}) \,\|\, N (μ_{p, t}, σ_{p, t}^{2}) \right) = \log \frac{σ_{p, t}}{σ_{t}} + \frac{σ_{t}^{2} + {(μ_{t} - μ_{p, t})}^{2}}{2 σ_{p, t}^{2}} - \frac{1}{2} \tag{8}$$
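
The closed form in Eq. (8) can be checked numerically with a short sketch. The scalar function follows the equation term by term. The summation over dimensions in `kl_diagonal` assumes a diagonal covariance, which is a common choice but is not stated explicitly in the text. Both function names are hypothetical.

```python
import math


def kl_gaussian(mu_q, var_q, mu_p, var_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for scalar Gaussians, as in Eq. (8)."""
    return (
        0.5 * math.log(var_p / var_q)                  # equals log(sigma_p / sigma_q)
        + (var_q + (mu_q - mu_p) ** 2) / (2.0 * var_p)
        - 0.5
    )


def kl_diagonal(mu_q, var_q, mu_p, var_p):
    """Sum of per-dimension KL terms, assuming diagonal Gaussians."""
    return sum(kl_gaussian(a, b, c, d) for a, b, c, d in zip(mu_q, var_q, mu_p, var_p))


kl_same = kl_gaussian(0.3, 0.5, 0.3, 0.5)      # identical distributions
kl_shift = kl_gaussian(1.0, 1.0, 0.0, 1.0)     # unit mean shift, equal variances
```

Two sanity checks follow directly from the formula: the divergence vanishes when posterior and prior coincide, and a unit shift in the mean with unit variances gives exactly one half.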

In Eq. (8), $μ_{t}$ and $σ_{t}^{2}$ are the posterior mean and variance, and $μ_{p, t}$ and $σ_{p, t}^{2}$ are the prior mean and variance. This term penalizes the difference between the posterior and prior distributions, encouraging the model to align them more closely and keeping the latent space structured and well-regularized.

The total loss function in the model is calculated as the sum of the reconstruction loss and the KL divergence loss. A weighting factor $β$ is introduced to control the relative importance of the KL divergence term:

$$L = L_{recon} + β \cdot L_{KL} \tag{9}$$

In Eq. (9), $L_{recon}$ is the reconstruction loss and $L_{KL}$ is the KL divergence loss. The hyperparameter $β$ controls the trade-off between these two terms, allowing the model to balance accurate reconstruction against a structured latent space.

The proposed model operates within a variational recurrent framework, where trajectory reconstruction serves as the primary learning objective. Since the model is trained to reconstruct normal vessel behavior, anomalous trajectories result in larger reconstruction errors. These deviations between observed and reconstructed trajectories are quantified using the anomaly score defined in Eq. (10):

$$\mathrm{AnomalyScore} (x_{t}) = \| x_{t} - {\hat{x}}_{t} \| \tag{10}$$

After calculating the Anomaly Score, the Anomaly Rate can be determined, showing the percentage of time points that contain anomalies in the trajectories. This is calculated by comparing the difference between the actual and reconstructed positions of the ship to a specific threshold, using the following equation:

$$\mathrm{AnomalyRate} = \frac{\sum_{t = 1}^{T} 1 (\| x_{t} - {\hat{x}}_{t} \| > \mathrm{threshold})}{T} \tag{11}$$

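In Eq. (11), $T$ is the total number of time points and $1 (\cdot)$ is the indicator function, equal to 1 when its condition holds and 0 otherwise. This formulation provides a quantitative measure of abnormal behavior in vessel trajectories by capturing deviations from learned normal patterns.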
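
Eqs. (10) and (11) can be sketched in a few lines of plain Python. The example uses the Euclidean norm for the score and a hypothetical threshold of 0.5. The two-dimensional toy points are illustrative only and do not come from the dataset.

```python
import math


def anomaly_score(x, x_hat):
    """Euclidean norm of the reconstruction residual for one time step (Eq. 10)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))


def anomaly_rate(observed, reconstructed, threshold):
    """Fraction of time steps whose score exceeds the threshold (Eq. 11)."""
    scores = [anomaly_score(x, x_hat) for x, x_hat in zip(observed, reconstructed)]
    flagged = sum(1 for s in scores if s > threshold)
    return flagged / len(scores)


observed = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
reconstructed = [[0.0, 0.1], [1.0, 1.0], [2.0, 5.0], [3.1, 3.0]]
rate = anomaly_rate(observed, reconstructed, threshold=0.5)   # hypothetical threshold
```

Only the third time step, whose reconstruction is off by three units, exceeds the threshold, so one of the four points is flagged.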

3.4 Anomaly detection approach

Given the anomaly score defined as the Euclidean distance in Eq. (10), anomaly detection is performed by comparing the reconstruction error against a predefined threshold. This threshold determines whether a given trajectory point is considered normal or anomalous. The anomaly threshold is estimated using a quantile-based approach derived from the distribution of reconstruction errors. Specifically, a percentile-based cutoff is selected to distinguish between normal and anomalous trajectory points. The optimal percentile is selected empirically using performance metrics such as the F1-score, ensuring a balance between false positives and false negatives. This results in a robust and adaptive anomaly detection mechanism that adjusts to the underlying data distribution.
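
The percentile-based selection described above can be sketched as follows. This is a minimal illustration, assuming the candidate thresholds are percentiles of a reference error distribution and that labeled evaluation errors are available for computing the F1-score. All function names and the toy data are hypothetical, and the interpolation rule in `percentile` is one of several common conventions.

```python
def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def f1_at_threshold(errors, labels, threshold):
    """F1-score when every error above the threshold is flagged as anomalous."""
    tp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 1)
    fp = sum(1 for e, y in zip(errors, labels) if e > threshold and y == 0)
    fn = sum(1 for e, y in zip(errors, labels) if e <= threshold and y == 1)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def select_percentile(reference_errors, eval_errors, eval_labels, candidates):
    """Return the (percentile, threshold, F1) triple with the highest F1."""
    best = None
    for q in candidates:
        thr = percentile(reference_errors, q)
        score = f1_at_threshold(eval_errors, eval_labels, thr)
        if best is None or score > best[2]:
            best = (q, thr, score)
    return best


# Toy data: reference errors from (assumed) normal behavior, plus labeled evaluation errors.
reference = [0.1 * i for i in range(1, 101)]            # 0.1, 0.2, ..., 10.0
eval_errors = [1.0, 2.0, 9.2, 9.6, 9.7, 12.0, 15.0]
eval_labels = [0, 0, 0, 1, 1, 1, 1]                     # 1 marks an anomaly
best_q, best_thr, best_f1 = select_percentile(
    reference, eval_errors, eval_labels, [90, 95, 98]
)
```

On this toy data the lowest cutoff admits a false positive and the highest cutoff misses two anomalies, so the middle candidate gives the best balance. The same trade-off between false positives and false negatives drives the empirical choice of percentile.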

3.5 Baseline models for comparison

For comparative evaluation, two benchmark models were selected. The first is Transformer–VAE,³¹ which integrates a Transformer architecture with a Variational Autoencoder and is trained on raw feature values. The second is GeoTrackNet,¹⁹ a state-of-the-art VRNN-based model that utilizes four-hot encoding for maritime trajectory representation. To ensure fairness in comparison, consistent hyperparameters were maintained across all models, including a batch size of 32, a learning rate of 1e-3 using the AdamW optimizer, and 200 training epochs.

3.6 Experimental evaluation

To evaluate the proposed model's performance, an experimental dataset was constructed by injecting synthetic anomalies and noise into 30% of the total trajectories. These anomalies were generated by perturbing key AIS attributes, including spatial position, speed over ground (SOG), and course over ground (COG), to simulate realistic abnormal vessel behaviors such as route deviation, abrupt speed changes, and erratic heading variations. This process provides a controlled benchmark for assessing detection capability. All models were trained using the AdamW optimizer, with trajectories segmented into chunks of ten time steps. Quantitative evaluation was conducted using multiple metrics, including reconstruction loss, KL divergence, total loss, anomaly rate, precision, recall, and F1-score. These metrics collectively assess each model's ability to accurately detect anomalous trajectories while preserving reliable reconstruction of normal vessel behavior. Model performance was continuously monitored throughout the training process, and comparative analysis was performed across baseline models and encoding configurations to evaluate robustness and scalability.
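
The injection procedure can be sketched as below. The text specifies only the 30% fraction and the perturbed attributes, so the perturbation magnitudes, the choice of one anomaly type per trajectory, and the function name `inject_anomalies` are all assumptions made for illustration.

```python
import random


def inject_anomalies(trajectories, fraction=0.30, seed=0):
    """Perturb a random subset of trajectories; return (data, labels)."""
    rng = random.Random(seed)
    n_anomalous = round(len(trajectories) * fraction)
    chosen = set(rng.sample(range(len(trajectories)), n_anomalous))
    data, labels = [], []
    for idx, traj in enumerate(trajectories):
        points = [dict(p) for p in traj]              # copy so the input is untouched
        if idx in chosen:
            kind = rng.choice(["position", "speed", "course"])
            start = rng.randrange(len(points))        # perturb from this step onward
            for p in points[start:]:
                if kind == "position":
                    p["lat"] += 0.5                   # hypothetical route deviation (degrees)
                elif kind == "speed":
                    p["sog"] += 15.0                  # hypothetical abrupt speed change (knots)
                else:
                    p["cog"] = (p["cog"] + rng.uniform(90, 270)) % 360   # erratic heading
        data.append(points)
        labels.append(1 if idx in chosen else 0)
    return data, labels


clean = [[{"lat": 20.0, "lon": 38.0, "sog": 12.0, "cog": 150.0} for _ in range(10)]
         for _ in range(20)]
noisy, labels = inject_anomalies(clean)
```

Returning the labels alongside the perturbed data is what makes precision, recall, and F1-score computable afterwards, since the ground truth for each trajectory is known by construction.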

4 Experiment and results

This section presents the performance results for the proposed Deep Quality Vessel Trajectory Inspector (DQTI) model, designed to detect anomalies in vessel trajectories. We compare the proposed models (DQTI-GRU, DQTI-LSTM, and DQTI-Hybrid) with two recent benchmark models: Transformer–VAE and GeoTrackNet. The evaluation focuses on trajectory reconstruction accuracy, latent representation quality, anomaly detection performance, and scalability.

4.1 Dataset and model parameters

This part describes the AIS trajectory dataset used for training, validation, and testing, all sourced from the Red Sea. As detailed in Section 3.1, the dataset comprises approximately 141,000 AIS messages from 800 vessels, collected during July and November 2024. The dataset underwent a series of preprocessing steps, including noise filtering, removal of stationary vessels (speed < 1 knot), trajectory segmentation, and, for relevant models, four-hot encoding. The four-hot encoding scheme, which discretizes Longitude, Latitude, Speed Over Ground (SOG), and Course Over Ground (COG), was applied at two distinct resolutions to evaluate its scalability: a standard resolution (260-bin) and a high resolution (560-bin). The Transformer–VAE model was trained on the raw (unencoded) values of these four features.
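
A four-hot vector for the standard resolution can be sketched as one one-hot block per feature, concatenated into 260 dimensions. The bin counts (100, 100, 50, 10) follow the 260-bin configuration. The value ranges, including the geographic bounding box and the speed cap, are hypothetical, as are the function names.

```python
def bin_index(value, low, high, n_bins):
    """Map a value in [low, high] to a bin index in 0..n_bins-1 (clamped at the edges)."""
    frac = (value - low) / (high - low)
    return min(max(int(frac * n_bins), 0), n_bins - 1)


# Bin counts follow the 260-bin configuration; the value ranges are assumptions.
FEATURES = [
    ("lon", 32.0, 44.0, 100),    # degrees east, hypothetical bounding box
    ("lat", 12.0, 30.0, 100),    # degrees north, hypothetical bounding box
    ("sog", 0.0, 30.0, 50),      # knots, hypothetical cap
    ("cog", 0.0, 360.0, 10),     # degrees
]


def four_hot(message):
    """Concatenate one one-hot block per feature into a single 260-dim vector."""
    vector = []
    for name, low, high, n_bins in FEATURES:
        block = [0] * n_bins
        block[bin_index(message[name], low, high, n_bins)] = 1
        vector.extend(block)
    return vector


msg = {"lon": 38.5, "lat": 21.0, "sog": 12.3, "cog": 275.0}   # illustrative AIS message
encoded = four_hot(msg)
active = [i for i, bit in enumerate(encoded) if bit == 1]
```

Each message activates exactly four bins, one per feature, so small measurement noise that stays within a bin leaves the encoded vector unchanged. Raising the bin counts to the 560-bin configuration only requires changing the last field of the `FEATURES` entries.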

Table 4 summarizes the key characteristics of the preprocessed dataset. The data was partitioned chronologically to prevent data leakage, with 70% used for training, 15% for validation, and the remaining 15% for testing. All models were trained using a chunking strategy, where trajectories were divided into smaller sequences (chunks) of 10 time steps to effectively handle irregularities and inconsistencies. Consistent hyperparameters were used to ensure a fair comparison: a batch size of 32, an initial learning rate of 1e-3 with the AdamW optimizer, and a maximum of 200 training epochs.
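
The chronological partition and the chunking strategy can be sketched as follows. The text does not say whether the split is applied per message or per trajectory, whether chunks overlap, or how incomplete chunks are handled. This sketch assumes a per-record split, non-overlapping chunks, and dropped remainders. The `timestamp` field and function names are hypothetical.

```python
def chronological_split(records, train=0.70, val=0.15):
    """Split time-ordered records into train/validation/test without shuffling."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    n_train = int(len(ordered) * train)
    n_val = int(len(ordered) * val)
    return (
        ordered[:n_train],
        ordered[n_train:n_train + n_val],
        ordered[n_train + n_val:],
    )


def chunk(trajectory, size=10):
    """Cut one trajectory into consecutive, non-overlapping chunks of `size` steps."""
    n_full = len(trajectory) // size          # an incomplete tail is dropped (assumption)
    return [trajectory[i * size:(i + 1) * size] for i in range(n_full)]


records = [{"timestamp": t, "sog": 10.0} for t in range(100)]
train_set, val_set, test_set = chronological_split(records)
chunks = chunk(list(range(60)))      # a 60-message sequence yields six chunks
```

Sorting by time before slicing ensures that every training record precedes every validation and test record, which is the property that prevents leakage from future observations.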

Table 4.
Summary of the preprocessed AIS trajectory dataset from the Red Sea.

Feature	260-bin configuration	560-bin configuration
Total Vessels	800
Total AIS Messages	∼141,000
Training/Validation/Test Split	70% / 15% / 15%
Encoding Dimensions	260 (Lon:100, Lat:100, SOG:50, COG:10)	560 (Lon:200, Lat:300, SOG:50, COG:10)
Seq length	60 msg
Chunk Size	10 time steps	10 time steps

Figure 4.

Training loss curves for all models over 200 epochs.

4.2 Training performance comparison

This subsection compares the learning efficiency and stability of all models during the training phase. Figure 4 illustrates the training loss curves over 200 epochs, providing insight into each model's convergence behavior (using 260-bin configuration for the four-hot based models). The Transformer–VAE model exhibits slower convergence compared to the DQTI variants, stabilizing after approximately 150 epochs with some noticeable fluctuations. GeoTrackNet shows a significantly higher loss and much slower convergence, struggling to learn effectively from the four-hot encoded data. Among the recurrent models, DQTI-GRU and DQTI-LSTM show steady improvement but with noticeable fluctuations. The proposed DQTI-Hybrid model clearly outperforms all others, achieving the fastest convergence and the lowest final training loss. Its curve descends sharply and stabilizes with minimal oscillation, indicating superior training efficiency and robustness.

Table 5 provides a quantitative summary of the training performance, confirming the visual observations from Figure 4. The DQTI-Hybrid model achieves the lowest final training loss (2.69), reconstruction loss (2.66), and KL divergence (0.03). The low KL divergence indicates a stable and well-regularized latent representation of normal vessel behavior. This is a substantial improvement over the baselines: Transformer–VAE reaches a total loss of 4.80, and GeoTrackNet exhibits a far higher loss of 134.22.

Table 5.
Comprehensive training performance comparison for all models (using 260-bin configuration for four-hot models).

Model	Total loss	Reconstruction loss	KL divergence	Anomaly rate
Transformer–VAE	4.80	4.75	0.05	3.5%
GeoTrackNet [19]	134.22	–	0.07	49.0%
DQTI-GRU	9.33	8.60	0.72	5.0%
DQTI-LSTM	6.83	6.73	0.10	1.0%
DQTI-Hybrid (Ours)	2.69	2.66	0.03	0.17%
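
As a quick consistency check on Table 5, the reported totals can be compared with the sum in Eq. (9). The sketch below assumes $β = 1$, which is an inference from the reported numbers rather than a setting stated in the text. GeoTrackNet is omitted because no reconstruction loss is reported for it.

```python
# (total loss, reconstruction loss, KL divergence) as reported in Table 5.
REPORTED = {
    "Transformer-VAE": (4.80, 4.75, 0.05),
    "DQTI-GRU": (9.33, 8.60, 0.72),
    "DQTI-LSTM": (6.83, 6.73, 0.10),
    "DQTI-Hybrid": (2.69, 2.66, 0.03),
}


def residual(total, recon, kl, beta=1.0):
    """Difference between the reported total and recon + beta * KL (Eq. 9)."""
    return total - (recon + beta * kl)


# Round to two decimals, the precision at which the table is reported.
residuals = {name: round(residual(*vals), 2) for name, vals in REPORTED.items()}
max_gap = max(abs(r) for r in residuals.values())
```

Under this assumption every reported total matches the sum of its two components to within 0.01, which is the rounding precision of the table.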

Figure 5.

Visual comparison of reconstruction performance. The DQTI-Hybrid model demonstrates the highest fidelity in capturing fine-grained movement patterns compared to the smoother, less accurate baselines.

4.3 Trajectory reconstruction performance

Having confirmed training efficiency, we proceed to evaluate the models’ ability to accurately reconstruct vessel trajectories. This capability is fundamental, as our model operates on the principle that a normal trajectory can be reconstructed with low error, while an anomalous trajectory will yield a high reconstruction error. Figure 5 presents a visual comparison between a ground truth trajectory segment (in blue) and its reconstructions from each model (in orange) for the four primary features.

The Transformer–VAE model reconstructs the trajectory in an overly smooth manner and fails to capture fine details and sharp transitions, particularly in the speed and course profiles. This suggests that raw data representation may be insufficient for learning the complex dynamics of vessel motion. GeoTrackNet slightly improves the result but still struggles with abrupt changes. In contrast, all DQTI models, and especially DQTI-Hybrid, follow the original trajectory with high fidelity. The proposed model reconstructs changes in speed and course almost identically to the original, indicating that it has effectively learned the natural motion patterns.

These visual observations are supported by the quantitative results in Table 6, where we computed the Mean Squared Error (MSE) for reconstruction. The DQTI-Hybrid achieves the lowest error (0.021), significantly outperforming Transformer–VAE (0.058) and GeoTrackNet (0.046). This confirms the hybrid model's capacity to learn and reproduce the complex spatiotemporal patterns in AIS data with high efficiency.

Table 6.
Quantitative trajectory reconstruction performance (260-bin configuration for relevant models)

Model	Reconstruction loss (MSE)
Transformer–VAE	0.058
GeoTrackNet [19]	0.046
DQTI-GRU	0.032
DQTI-LSTM	0.027
DQTI-Hybrid (Ours)	0.021

4.4 Impact of encoding resolution (multi-dimensional encoding analysis)

Having established DQTI-Hybrid as the optimal architecture, we analyze the impact of four-hot encoding resolution on its performance. Table 7 compares the model's performance using the standard (260-bin) and high-resolution (560-bin) encoding schemes. Although the high-resolution encoding offers greater granularity, it introduces significant challenges. The total loss of the DQTI-Hybrid model increases from 2.69 to 5.19, and the KL divergence rises from 0.03 to 0.09, indicating a less stable latent space. More importantly, the anomaly rate during testing increases from 0.5% to 1.1%. This suggests that the 560-bin encoding may capture fine-grained noise as salient features, leading to a slightly higher false positive rate without a corresponding gain in detecting genuine anomalies. The 260-bin configuration provides the best balance between accuracy, computational efficiency, and a practical anomaly detection rate for the geographical characteristics of the Red Sea.

Table 7.
Performance comparison of DQTI-hybrid with different encoding resolutions.

Encoding resolution	Total loss	Reconstruction loss	KL divergence	Anomaly rate (Test)
Standard (260-bin)	2.69	2.66	0.03	0.5%
High-Resolution (560-bin)	5.19	5.10	0.09	1.1%

Figure 6 plots the reconstruction loss per epoch for both encoding resolutions. The 560-bin configuration shows higher loss throughout training with more pronounced fluctuations. The standard 260-bin configuration provides a smoother and more stable training curve, confirming its suitability for this application.

Figure 6.

Reconstruction loss for both encoding resolutions.

4.5 Anomaly detection threshold optimization

A critical step in anomaly detection is selecting an optimal threshold to distinguish between normal and anomalous behavior. We analyze the distribution of reconstruction errors on the training and test sets using the DQTI-Hybrid model with the standard 260-bin configuration. Figure 7 shows the distribution of these errors. The training set errors (representing normal behavior) are tightly clustered around low values, while the test set exhibits a long tail, indicating the presence of anomalous trajectories with high reconstruction errors. This separation motivates the use of a percentile-based threshold on the training set errors.

Figure 7.

Distribution of reconstruction errors for training and test sets. The training set (blue) represents normal behavior and is tightly clustered. The test set (orange) shows a long tail, suggesting the presence of anomalous samples with high reconstruction error.

We evaluate three candidate thresholds corresponding to the 90th, 95th, and 98th percentiles of the training error distribution. Table 8 presents the precision, recall, and F1-score for anomaly detection at each threshold. The 98th percentile yields high precision but much lower recall, missing many true anomalies. The 90th percentile captures more anomalies but at the cost of significantly lower precision (more false positives). The 95th percentile provides the best trade-off, achieving the highest F1-score (0.92), and is therefore selected as the optimal threshold for all subsequent experiments.

Table 8.

Anomaly detection performance for DQTI-hybrid at different thresholds (260-bin configuration).

Threshold (Percentile)	Precision	Recall	F1-Score
90th	0.81	0.98	0.89
95th	0.94	0.91	0.92
98th	0.98	0.70	0.82
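
The F1-scores in Table 8 can be recomputed from the reported precision and recall as their harmonic mean. The short sketch below does this for the three thresholds, with values copied from the table and hypothetical function and variable names.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per percentile threshold, copied from Table 8.
TABLE_8 = {90: (0.81, 0.98, 0.89), 95: (0.94, 0.91, 0.92), 98: (0.98, 0.70, 0.82)}

recomputed = {q: round(f1_score(p, r), 2) for q, (p, r, _) in TABLE_8.items()}
best_percentile = max(recomputed, key=recomputed.get)
```

The recomputed values agree with the reported ones at two decimals, and the 95th percentile has the highest F1-score, consistent with its selection as the operating threshold.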

4.6 Cross-Model anomaly detection comparison

Using the optimal 95th percentile threshold, we compare the anomaly detection performance of all models on the test set. Figure 8 shows the anomaly detection error rate over training epochs, demonstrating that DQTI-Hybrid quickly establishes a low and stable error rate compared to other models.

Figure 8.

Anomaly detection error rate under the 260-bin configuration.

Table 9 provides a comprehensive comparison using the standard 260-bin configuration for models that utilize it. The DQTI-Hybrid model achieves the highest scores across all metrics, with a Precision of 0.94, Recall of 0.91, and F1-Score of 0.92. This represents a substantial improvement over the baselines, particularly in recall, indicating its superior ability to correctly identify true anomalous trajectories. The proposed model clearly outperforms Transformer–VAE, which achieves an F1-score of 0.68, confirming the effectiveness of the four-hot representation and the proposed hybrid architecture.

Table 9.

Cross-Model anomaly detection comparison (260-bin configuration for relevant models, 95th percentile threshold).

Model	Precision	Recall	F1-Score
Transformer–VAE	0.71	0.65	0.68
GeoTrackNet [19]	0.44	0.58	0.50
DQTI-GRU	0.77	0.81	0.79
DQTI-LSTM	0.84	0.86	0.85
DQTI-Hybrid (Ours)	0.94	0.91	0.92

Figure 9 presents the confusion matrices for all models, visually demonstrating the superior performance of the DQTI-Hybrid, which correctly classifies the vast majority of both normal and anomalous trajectories. It can be observed that Transformer–VAE fails to identify a considerable number of anomalies (high false negatives), while GeoTrackNet suffers from an excessive number of false positives.

Figure 9.

Confusion matrices for all models.

4.7 Analysis of detected anomalies

A qualitative analysis was performed on the 34 anomalous trajectories successfully detected by the DQTI-Hybrid model (from the test set using the 260-bin configuration). These were categorized into three primary behavioral patterns, as summarized in Table 10. The model proved highly effective in detecting various forms of genuine irregular patterns. Figure 10 provides a visual example for each anomaly type, illustrating the model's ability to pinpoint the exact location and nature of the deviant behavior.

Figure 10.

Visual examples of detected anomalies: (a) spatial deviation, (b) atypical speed, (c) course disruption. The model successfully isolates segments (highlighted in red) that deviate from learned normal patterns.

Table 10.

Categorization of anomalies detected by the DQTI-hybrid model.

Anomaly type	Count	Description
Spatial Deviation	18	Trajectories deviating from established shipping lanes.
Atypical Speed	9	Sudden, unrealistic accelerations or speeds outside the range.
Course Disruption	7	Erratic and unstable heading (COG) changes.

Figure 11 and Figure 12 illustrate a comparison between the representation of normal and anomalous paths using the DQTI-Hybrid model. The normal path (Figure 11) exhibits high agreement between the original path and the reconstruction, with a stable and coherent latent representation. In contrast, the anomalous path (Figure 12) shows clear deviations and a high reconstruction error, with points scattered in the latent space outside the normal range. These results confirm the model's ability to effectively distinguish between normal and abnormal behavior based on the properties of the reconstruction and the latent representation.

Figure 11.

DQTI-Hybrid representation of a normal trajectory.

Figure 12.

DQTI-Hybrid representation of an anomalous trajectory.

4.8 Summary of results

The experimental results demonstrate that the proposed DQTI-Hybrid model provides a more efficient, stable, and reliable solution for reconstruction-based anomaly detection and AIS data quality assessment than all baseline models, including the Transformer–VAE which relies on raw inputs and the VRNN-based GeoTrackNet. Its superior performance is attributed to several complementary design choices: the hybrid recurrent architecture enhances temporal modeling across varying time scales, explicit reconstruction supervision strengthens the reliability of the learned normal-behavior baseline, the chunking strategy improves training stability on irregular AIS streams, and the four-hot encoding proves superior to raw feature representation in capturing spatial and dynamic relationships. The 260-bin four-hot encoding scheme, combined with the DQTI-Hybrid architecture and the 95th percentile anomaly threshold, offers the best balance between reconstruction accuracy, latent stability, and a practical, high-precision anomaly detection rate for the complex maritime environment of the Red Sea.

5 Conclusion

This study proposed DQTI, a hybrid VRNN-based framework for AIS data quality enhancement and anomaly detection in vessel trajectories. By combining a learnable GRU–LSTM fusion mechanism with four-hot encoding and reconstruction-based anomaly analysis, the proposed model achieved superior performance compared with benchmark models, including Transformer–VAE and GeoTrackNet. Experimental results on more than 141,000 AIS records from 800 vessels in the Red Sea showed that DQTI-Hybrid achieved the lowest total loss (2.69), the lowest KL divergence (0.03), and the highest anomaly detection performance with a precision of 0.94, recall of 0.91, and F1-score of 0.92. These findings confirm that the proposed framework provides a stable latent representation and more reliable separation between normal and anomalous vessel behavior. In addition, the 260-bin four-hot configuration provided the best balance between spatial resolution, latent stability, and practical anomaly detection performance, making it more suitable than the higher-resolution 560-bin configuration for the geographical characteristics of the Red Sea. The results also demonstrate that the reduction in anomaly rate reflects improved discrimination capability rather than suppression of abnormal behavior, which strengthens the practical relevance of the model for maritime monitoring applications. Although the present study focused on the Red Sea, the proposed framework can be adapted to other maritime regions. Future work will investigate cross-region validation, real-time anomaly monitoring, and the integration of contextual variables such as weather, sea state, and traffic density to further improve robustness under diverse navigational conditions.

Footnotes

Acknowledgements

None.

Ethical considerations

Not applicable. This study does not involve human participants or animals and relies solely on publicly available AIS vessel trajectory data.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Author contributions

The authors confirm contribution to the paper as follows: Conceptualization, Jawaher Alqahtani, Ayman Yafoz, and Mohamed El-Eliemy; methodology, Jawaher Alqahtani and Mohamed El-Eliemy; software, Jawaher Alqahtani; validation, Jawaher Alqahtani, Ayman Yafoz, and Mohamed El-Eliemy; formal analysis, Jawaher Alqahtani; investigation, Jawaher Alqahtani, Ayman Yafoz, and Mohamed El-Eliemy; resources, Jawaher Alqahtani; data curation, Jawaher Alqahtani; writing—original draft preparation, Jawaher Alqahtani; writing—review and editing, Jawaher Alqahtani, Ayman Yafoz, and Mohamed Hamdy El-Eliemy; visualization, Jawaher Alqahtani; supervision, Ayman Yafoz and Mohamed El-Eliemy; project administration, Ayman Yafoz and Mohamed El-Eliemy. All authors have read and approved the final version of the manuscript.

Funding

The project was funded by KAU Endowment (WAQF) at King Abdulaziz University, Jeddah, Saudi Arabia. The authors, therefore, acknowledge with thanks WAQF and the Deanship of Scientific Research (DSR) for technical and financial support.

Funding Statement: The authors acknowledge with thanks WAQF and the Deanship of Scientific Research at King Abdulaziz University for funding this work.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Data availability

The data used in this study are available from the corresponding author upon reasonable request.

Availability of data and materials

The data that support the findings of this study are available from the Corresponding Author, J.A., upon reasonable request.

ORCID iDs

Jawaher Alqahtani

Mohamed Hamdy El-Eliemy

References

Węcel

Stróżyna

Szmydt

, et al. The impact of crises on maritime traffic: a case study of the COVID-19 pandemic and the war in Ukraine. Netw Spat Econ 2024; 24: 199–230. Springer: 199–230.

Afifi

. The importance and components of economic cooperation among Red Sea countries. Journal of Economics, Management and Trade 2024; 30: 29–33. Sciencedomain International: 29–33.

Farah

. Shifting tides amidst regional challenges: navigating horn of Africa’s geopolitical chessboard—literature review. Open J Soc Sci 2024; 12: 70–83. Scientific Research Publishing, Inc.: 70–83.

Wolsing

Roepert

Bauer

, et al. Anomaly Detection in Maritime AIS Tracks: A Review of Recent Approaches. Journal of Marine Science and Engineering 2022; 10: 12. MDPI.

Liang

Liu

, et al. AISClean: AIS data-driven vessel trajectory reconstruction under uncertain conditions. Ocean Eng 2024; 306, Elsevier Ltd.

Zhang

Ren

, et al. Incorporation of Deep Kernel Convolution into Density Clustering for Shipping AIS Data Denoising and Reconstruction. Journal of Marine Science and Engineering 2022; 10. MDPI.

Stróżyna

Filipiak

Węcel

. Data quality assessment – A use case from the maritime domain. In: Lecture notes in business information processing 394. Cham: Springer, 2026, pp.5–20.

Xiao

, et al. AIS Data analytics for intelligent maritime surveillance systems. Cham: Springer, 2021, pp.393–411.

Rovinelli

Rocchesso

Simeoni

, et al. Spatiotemporal characterisation of underwater noise through semantic trajectories. GeoInformatica 2025; 29: 845–876. Springer: 845–876.

10.

Larayedh

Cornuelle

Krokos

, et al. Numerical investigation of shipping noise in the Red Sea. Sci Reports 2024; 14: 5851. Nature Research: 5851.

11.

Androjna

Perkovič

Pavic

, et al. Ais data vulnerability indicated by a spoofing case-study. Applied Sciences (Switzerland) 2021; 11,(MDPI AG.

12.

Anuoluwapo

. Quality Assessment of Maritime AIS Data Title: Quality Assessment of Maritime AIS Data . 2023.

13.

Silveira

PAM

Teixeira

Soares

. Use of AIS data to characterise marine traffic patterns and ship collision risk off the coast of Portugal. J Navig 2013; 66: 879–898.

14.

Yang

Wang

, et al. How big data enriches maritime research – a critical review of automatic identification system (AIS) data applications. Transport Reviews 2019; 39: 755–773. Routledge: 755–773.

15.

Lei

Chu

, et al. A visual analysis approach to understand and explore quality problems of AIS data. Journal of Marine Science and Engineering 2021; 9: 1–18. MDPI AG: 1–18.

16.

Mekkaoui

Berrado

Benabbou

. Automatic Identification System Data Quality: Outliers Detection Case. Epub ahead of print March 2022. 2022.

17.

Yang

, et al. Harnessing the power of Machine learning for AIS Data-Driven maritime Research: A comprehensive review. Transp Res Part E Logist Transp Rev 2024; 183, Elsevier Ltd.

18.

Chen

Ling

Yang

, et al. Ship Trajectory Reconstruction from AIS Sensory Data via Data Quality Control and Prediction. In: Mathematical Problems in Engineering 2020. Hindawi Limited, 2020.

19.

Nguyen

Vadaine

Hajduch

, et al. GeoTrackNet-A Maritime Anomaly Detector using Probabilistic Neural Network Representation of AIS Tracks and A Contrario Detection. IEEE Trans Intell Transp Syst 2021; 23. Epub ahead of print 2021. DOI: 10.1109/TITS.2021.3055614ï.

20.

Capobianco

Millefiori

Forti

, et al. Deep Learning Methods for Vessel Trajectory Prediction based on Recurrent Neural Networks. Epub ahead of print 7 January 2021. DOI: 10.1109/TAES.2021.3096873. 2021.

21.

Forti

Millefiori

Braca

, et al. Prediction oof vessel trajectories from AIS data via sequence-to-sequence recurrent neural networks. In: ICASSP 2020 - 2020 IEEE international conference on acoustics, speech and signal processing (ICASSP), May 2020. Piscataway, NJ: IEEE, 2020, pp.8936–8940.

22.

Zhang

Kujala

Musharraf

, et al. A machine learning method for the prediction of ship motion trajectories in real operational conditions. Ocean Eng 2023; 283, Elsevier Ltd.

23.

Zhao

Shi

. Maritime anomaly detection using density-based clustering and recurrent neural network. J Navig 2019; 72: 894–916.

24.

Billah

Zhang

. A method for vessel’s trajectory prediction based on encoder decoder architecture. Journal of Marine Science and Engineering 2022; 10: 1529.

25.

Zhang

Hirayama

Ren

, et al. Ship anomalous behavior detection using clustering and deep recurrent neural network. Journal of Marine Science and Engineering 2023; 11: 763.

26.

Kieu

Yang

Guo

, et al. Anomaly detection in time series with robust variational quasi-recurrent autoencoders. In: 2022 IEEE 38th international conference on data engineering (ICDE), May 2022. Piscataway, NJ: IEEE, 2022, pp.1342–1354.

27. Lin, Clark, Birke, et al. Anomaly detection for time series using VAE-LSTM hybrid model. In: ICASSP 2020 - 2020 IEEE international conference on acoustics, speech and signal processing (ICASSP), May 2020. Piscataway, NJ: IEEE, 2020, pp.4322–4326.

28. Iqbal, Qureshi. Reconstruction probability-based anomaly detection using variational auto-encoders. Int J Comput Appl 2023; 45: 231–237.

29. Clemmensen, KLKH, Olesen, Hørlück. Anomaly Detection for AIS Data Using Deep Neural Networks for Trajectory Predictions Marie Normann Gadeberg supervised by. Available at: https://www.compute.dtu.dk/english. 2021.

30. Xia, Gao. Analysis of vessel anomalous behavior based on Bayesian recurrent neural network. In: 2020 IEEE 5th international conference on cloud computing and big data analytics (ICCCBDA), April 2020. Piscataway, NJ: IEEE, 2020, pp.393–397.

31. Hou, Zhou, Grifoll, et al. A transformer–VAE approach for detecting ship trajectory anomalies in cross-sea bridge areas. Journal of Marine Science and Engineering 2025; 13(5): 849.

32. Takahashi, Zama, Hiroi. Ship trajectory prediction using AIS data with TransFormer-based AI. In: 2024 IEEE conference on artificial intelligence (CAI). Piscataway, NJ: IEEE Computer Society, 2024, pp.1302–1305.

33. Kazemi, Abghari, Lavesson, et al. Open data for anomaly detection in maritime surveillance. Expert Syst Appl 2013; 40: 5719–5729.

Table 6. Quantitative trajectory reconstruction performance (260-bin configuration for relevant models).

| Model | Reconstruction loss (MSE) |
|---|---|
| Transformer–VAE | 0.058 |
| GeoTrackNet [19] | 0.046 |
| DQTI-GRU | 0.032 |
| DQTI-LSTM | 0.027 |
| DQTI-Hybrid (Ours) | 0.021 |

Table 7. Performance comparison of DQTI-Hybrid with different encoding resolutions.

| Encoding resolution | Total loss | Reconstruction loss | KL divergence | Anomaly rate (Test) |
|---|---|---|---|---|
| Standard (260-bin) | 2.69 | 2.66 | 0.03 | 0.5% |
| High-Resolution (560-bin) | 5.19 | 5.10 | 0.09 | 1.1% |