Abstract
Accurate mapping of hydrothermal alteration zones is critical for improving the efficiency of porphyry copper exploration. In the Kuhpanj porphyry copper district (Kerman, Iran), distinguishing phyllic, argillic and propylitic alterations from surrounding lithologies using satellite data remains challenging due to spectral complexity and spatial heterogeneity. This study proposes an improved semantic segmentation framework for spaceborne hyperspectral imagery, exploiting 41 bands of PRISMA data to delineate alteration zones at the deposit scale. A modified U-net architecture was developed that employs a dual-path design for the concurrent extraction of spectral and spatial features: one branch processes pixel-wise spectra across the 41 bands, while the second branch captures local spatial context within 256 × 256 × 41 patches. The performance of the proposed network was benchmarked against a V-net architecture using a confusion-matrix-based evaluation. The proposed model achieved an F1-score of 86% while being trained on a limited labelled dataset, and it requires substantially fewer trainable parameters than the reference architecture, highlighting its efficiency for data-constrained exploration scenarios. The results demonstrate that the new dual-path U-net significantly enhances the reliability of alteration mapping from PRISMA hyperspectral data and provides a computationally efficient deep-learning solution for processing high-dimensional geoscientific imagery. This contribution extends current applications of convolutional neural networks in mineral exploration by introducing a tailored architecture that improves both accuracy and model compactness for hyperspectral semantic segmentation.
Introduction
In geoscience, the discovery and evaluation of mineral resources have traditionally depended on detailed examinations of geological formations, a process often constrained by the extensive and inaccessible terrains typical of many mineralised regions. The analysis of lithological units, structural features and mineral assemblages provides essential evidence for locating potential mineral deposits and assessing their economic significance. However, numerous mineral-rich areas are situated in remote, rugged landscapes that lack infrastructure and present harsh environmental conditions, thereby increasing the difficulty and cost of exploration activities. These limitations restrict field access and hinder the collection of reliable geological data. To address such challenges, a range of technological approaches has been developed to facilitate mineral exploration in inaccessible terrains. Remote sensing methods, including satellite imagery and aerial surveys, have become valuable tools for detecting geological features and identifying prospective mineral targets from a distance. The integration of remote sensing into mineral exploration has expanded analytical capabilities beyond those offered by conventional field–based techniques, enabling the detection of subtle mineralisation patterns. 
Technologies such as X-ray fluorescence and hyperspectral imaging allow for precise, non-destructive characterisation of mineral compositions and subsurface structures.1,2 The combined use of datasets from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), Advanced Land Imager and Hyperion sensors has significantly improved the identification of hydrothermal alteration minerals and delineation of alteration zones associated with porphyry copper systems.3,4 More recently, the PRISMA hyperspectral sensor has demonstrated strong potential in mineral exploration, particularly in arid environments such as Cuprite, Nevada, where it has been used effectively to map alteration minerals. 5 Comparative studies indicate that PRISMA's hyperspectral capabilities and relatively high spatial resolution support a wide range of applications, including forest type discrimination, post-fire land–cover mapping and mineralisation targeting.6,7 These advances, combined with spectral image–processing techniques such as principal component analysis, band ratios and minimum noise fraction transformations, have substantially enhanced the ability to map alteration features across large and remote areas with a level of precision not previously achievable.
Semantic segmentation of hyperspectral data is fundamental across multiple domains, including mineral exploration, urban development and environmental monitoring. Accurate delineation of alteration zones, road networks or other surface features within complex imagery is essential for informed decision-making; however, despite recent technological advances, the segmentation of hyperspectral data remains a challenging task due to its high dimensionality, spectral redundancy and sensitivity to noise. In recent years, deep learning models – particularly those based on Convolutional Neural Networks (CNNs) – have shown substantial promise in improving segmentation performance. Numerous studies have focused on designing novel architectures to enhance both the accuracy and computational efficiency of hyperspectral segmentation algorithms. For example, Baheti et al. 8 demonstrated that jointly processing multimodal hyperspectral and LiDAR datasets through a composite style–fusion framework, incorporating point-based CNN layers for LiDAR point clouds, yields higher pixel-level and mean accuracies compared to unimodal approaches. Their findings indicate that combining spectral and structural information can significantly improve segmentation reliability in remote sensing applications.
Advancements in deep learning architectures, including U-Net variants and point-based CNN models, have further strengthened the ability to map mineral alterations, extract road networks and process multimodal datasets for operational decision-support tasks. Within mineral exploration, considerable progress has been made through the development of image processing techniques aimed at analysing spatial and temporal satellite data. Fully convolutional networks now serve as a benchmark for evaluating change detection and object identification in multi-temporal imagery. Patel et al. 9 introduced a modified U-Net architecture designed for road–network segmentation, demonstrating that comparable accuracy can be achieved with fewer training samples and reduced architectural complexity, leading to shorter training times and strong performance across accuracy, Intersection over Union (IoU) and Dice metrics.
Similarly, Decker and Borghetti 10 investigated multimodal hyperspectral–LiDAR fusion using a composite style–fusion model combined with point-based CNN layers, achieving superior performance relative to unimodal baselines. Their subsequent work 11 introduced Y-Net, a densely connected Siamese network for multiclass change detection, employing dual-stream DenseNet encoders and an attention-based fusion strategy to better capture bi-temporal variations. Bidari et al. 12 further advanced the field by enabling hyperspectral data processing within point-convolution frameworks and proposing a composite fusion-style neural network for segmenting LiDAR point clouds. Comparative evaluations of these approaches emphasise the potential of innovative multimodal fusion strategies to enhance the interpretation of complex remote-sensing datasets.
Collectively, these studies highlight the critical role of advanced image processing techniques, deep learning architectures and multimodal data representation strategies in improving hyperspectral semantic segmentation for alteration mapping in mineral deposits. The integration of such methods not only enhances the precision of mineral exploration workflows but also facilitates efficient processing of high-dimensional hyperspectral datasets, supporting broader applications across geoscience and remote-sensing disciplines.13,14
However, the direct application of standard deep learning models to the high-dimensional and spectrally complex hyperspectral datasets used in mineral exploration has revealed notable limitations. Preliminary evaluations using multiple U-Net configurations with different backbone architectures, applied to hyperspectral images of 256 × 256 pixels and 41 spectral bands, consistently failed to generate reliable segmentation outputs. These outcomes indicate a significant methodological gap and emphasise the need for architectures capable of processing high-dimensional hyperspectral inputs directly, without relying on conventional preprocessing techniques such as band ratios, principal component analysis or minimum noise fraction transformations.
To address these challenges, the present study introduces a novel framework for the semantic segmentation of hyperspectral data, with a specific application to the Kuhpanj Porphyry Copper Deposit in Kerman, Iran. The proposed approach is designed to achieve three primary objectives:
Introduce a new architecture capable of directly processing raw hyperspectral data without dependence on traditional preprocessing techniques.
Demonstrate the model's capacity to learn effectively from a limited number of annotated samples, thereby mitigating one of the principal constraints associated with applying deep learning to remote-sensing datasets.
Compare the performance of the proposed dual-path architecture with that of the V-Net model, which has gained prominence in medical image segmentation for its rapid processing and high computational efficiency.
By fulfilling these objectives, the study contributes to ongoing advancements in mineral exploration and establishes a new benchmark for hyperspectral semantic segmentation. The following sections outline the methodological framework, present the experimental results and discuss the broader implications for enhancing alteration mapping and remote-sensing workflows within geoscience.
Geological setting
The Kuhpanj porphyry copper deposit is situated between latitudes 29°51′5.6″ N and 29°51′30″ N and longitudes 56°4′45.5″ E and 56°4′3″ E, with elevations reaching approximately 3120 m at the highest points. According to Dimitrijivic, 15 the region exhibits considerable geological complexity, characterised by diverse lithologies, mineral assemblages and alteration patterns. Extensive alteration zones typical of porphyry copper systems – including phyllic, argillic and propylitic assemblages – are prominently developed throughout the area. Detailed geological, petrological and mineralogical investigations at varying scales have documented the distinct physical and chemical characteristics of the host rocks, contributing to comprehensive geospatial interpretations reported by Gonbadi et al. 16 and Khosravi. 17 These alteration zones display diagnostic anomalies and mineral signatures that are essential for delineating exploration targets.
Structural analyses by Roshani et al. 18 describe a complex framework shaped by interactions among volcanic activity, sedimentation, tectonic deformation and erosional processes. This interplay has played a central role in the formation of mineralised zones within the deposit. Mineralogically, phyllic alteration is predominantly associated with muscovite and illite, while argillic alteration is characterised by kaolinite, montmorillonite and dickite. Propylitic alteration zones commonly contain epidote and chlorite. These minerals exhibit distinctive spectral features that can be detected using advanced remote-sensing technologies, including hyperspectral observations from the PRISMA satellite, enabling accurate discrimination and mapping of alteration patterns.
The geological and mineralogical attributes of the Kuhpanj region underscore its significance as a promising target for mineral exploration. Its position within the Urmia–Dokhtar volcanic belt – a well-known metallogenic province hosting numerous porphyry systems – further enhances its exploration potential and has contributed to sustained scientific and economic interest in the area.
Materials and methods
Data selection
The PRISMA satellite (PRecursore IperSpettrale della Missione Applicativa), developed by the Italian Space Agency, is an advanced Earth-observation platform designed to provide high-quality hyperspectral data. Launched on 22 March 2019 and operational as of May 2020, 19 PRISMA acquires imagery across 239 spectral bands spanning the 400–2500 nm range. 20 The system integrates a hyperspectral sensor with a medium-resolution panchromatic imager, enabling detailed spectral analysis together with contextual spatial information. 7 Its data have been applied across a broad spectrum of environmental and geoscientific studies, including post-fire land-cover assessment, mineral alteration mapping, snow and glacier surface characterisation, vegetation function monitoring and wildfire temperature estimation.6,21–23 PRISMA's capabilities have also supported forest-type discrimination, detection of non-photosynthetic vegetation and monitoring invasive marine macroalgae.24–26 Recent studies have shown its effectiveness in quantifying methane emissions and evaluating carbon dioxide output from power plants.27,28 Validation and calibration efforts have compared PRISMA measurements against radiative transfer model simulations, field spectrometer data and reflectance observations.29,30 Additional applications include agricultural soil characterisation, glacier ice analysis and water–quality investigations. 31 Table 1 summarises the principal features and capabilities of the PRISMA satellite.
PRISMA satellite specifications.
PRISMA data products are distributed across several processing levels, including Level 1 and Level 2, with more specific designations such as Level 1A, 1B, 1C and Level 2D. Each level represents a distinct stage of radiometric, geometric and atmospheric processing within the PRISMA data framework. Level 1 products, for example, have been employed in applications such as water–quality assessment, where radiometrically calibrated radiances are compared with simulated values derived from in situ measurements. 28 Level 2D products have been widely used in water–quality retrieval and mineral–alteration mapping, demonstrating the capability of PRISMA hyperspectral data to detect compositional variations, including alteration minerals.5,32
In this study, Level 2D data are utilised. These products incorporate standard geometric and atmospheric corrections, thereby improving the accuracy, interpretability and overall reliability of the hyperspectral measurements.
To identify PRISMA spectral bands relevant to phyllic, propylitic and argillic alteration zones in the Kuhpanj porphyry copper deposit, guidance was taken from studies that have examined similar hydrothermal systems using spaceborne multispectral and hyperspectral data. Zhang et al., 33 for example, mapped hydrothermal alteration in the Duolong porphyry Cu–Au deposit using ASTER and Landsat 8 OLI imagery, demonstrating the effectiveness of ASTER short-wavelength infrared (SWIR) and Landsat VNIR/SWIR bands in delineating phyllic, argillic and propylitic assemblages. Similarly, Abedi et al. 34 highlighted the diagnostic value of SWIR regions for discriminating hydrothermal alteration minerals associated with porphyry copper systems. Their findings emphasise that key absorption features characteristic of muscovite–illite (phyllic), kaolinite–montmorillonite (argillic) and chlorite–epidote (propylitic) are prominently expressed in SWIR wavelengths.
Additional evidence from Bolouki et al. 35 demonstrated that band ratio and relative absorption band depth techniques applied to ASTER SWIR data enable robust detection of advanced argillic, argillic–phyllic and propylitic zones. The spectral bands identified in these studies provide a reliable basis for selecting PRISMA wavelengths sensitive to similar mineralogical features in the Kuhpanj deposit.
The selection of 41 PRISMA spectral bands for alteration mapping in the Kuhpanj porphyry copper deposit is based on their sensitivity to diagnostic absorption features associated with minerals characteristic of propylitic, argillic and phyllic alteration zones. This targeted band selection follows the detailed spectral analysis presented by Esmaeili et al., 36 who identified specific PRISMA wavelengths corresponding to key spectral signatures of epidote and chlorite (propylitic), kaolinite, montmorillonite and dickite (argillic), as well as muscovite and illite (phyllic). These minerals exhibit well-defined absorption features primarily in the SWIR region, enabling reliable discrimination of alteration assemblages using selected subsets of PRISMA's hyperspectral range. For integration into the deep learning model, the hyperspectral cube containing the selected 41 bands was spatially cropped to a standardised dimension of 256 × 256 pixels to ensure uniformity across training and evaluation datasets.
Using the full set of 41 hyperspectral bands directly in the model, rather than reducing the data to three-channel (RGB) imagery through band ratio transformations, PCA or similar dimensionality reduction techniques, provides several methodological advantages. First, the complete spectral representation retains the enhanced spectral resolution necessary for capturing diagnostic absorption features of alteration minerals, many of which cannot be distinguished in RGB space. Second, the availability of detailed spectral information improves classification accuracy, as deep learning architectures can exploit subtle inter-band variations that are critical for differentiating complex geological materials. Third, direct analysis avoids the information loss typically associated with RGB conversion or dimensionality reduction methods, ensuring that the full spectral richness contributes to the model's learning process. Finally, the use of multiple narrow spectral bands facilitates the detection of fine-scale mineralogical variations and alteration patterns, enabling more advanced geological interpretation than would be possible with three-channel imagery.
Training point
Figure 1 shows the location of the Kuhpanj area within the Urmia–Dokhtar magmatic arc together with the 1:5000 geological map of the deposit. In addition to the geological framework and the spatial distribution of hydrothermal alteration, it shows the location of the training points used to construct the mask image employed in network training. This mask image was defined in five classes, namely phyllic, argillic and propylitic alteration zones, the non-altered/background domain, and a class of unidentified pixels.

The location of the Kuhpanj area within the Urmia–Dokhtar magmatic arc and the 1:5000 geological map of Kuhpanj (Khosravi 17).
To create this mask, a total of 30 labelled points were selected from the alteration map compiled in previous studies, 17 and each point was assigned to one of the four main categories (phyllic, argillic, propylitic and background), while pixels outside these categories were treated as unidentified. Due to the limited number of labelled points and their uneven distribution among alteration classes, the resulting dataset is inherently imbalanced. To mitigate this issue, class weights proportional to the inverse frequency of each class were incorporated into the loss function during model training. The weight associated with the unidentified class was set to zero, so that this class did not contribute to the optimisation process.
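The inverse-frequency weighting described above can be sketched as follows. This is a minimal illustration rather than the study's implementation: the function name, the per-class pixel counts and the normalisation (active weights averaging to one) are all assumptions made for the example.

```python
def inverse_frequency_weights(counts, ignore=("unidentified",)):
    """Return class weights proportional to 1/frequency; ignored classes get 0."""
    active = {c: n for c, n in counts.items() if c not in ignore and n > 0}
    total = sum(active.values())
    raw = {c: total / n for c, n in active.items()}
    # Normalise so that the active weights average to 1 (an assumed convention).
    scale = len(raw) / sum(raw.values())
    weights = {c: raw[c] * scale for c in raw}
    # Ignored (or empty) classes contribute nothing to the loss.
    for c in counts:
        weights.setdefault(c, 0.0)
    return weights


# Hypothetical labelled-pixel counts per class (not taken from the study).
pixel_counts = {
    "phyllic": 120,
    "argillic": 300,
    "propylitic": 200,
    "background": 600,
    "unidentified": 5000,
}
weights = inverse_frequency_weights(pixel_counts)
```

Rare classes such as the hypothetical phyllic class receive the largest weights, while the unidentified class is excluded from optimisation by its zero weight.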
The labelled points were then projected onto the 256 × 256 PRISMA hyperspectral cube (41 bands), and all pixels falling within the mapped training polygons were used to generate patch-based samples for network training and validation. Specifically, fixed-size patches of 256 × 256 × 41 were extracted such that each patch contained at least one labelled pixel of the target classes, and the corresponding segmentation masks were derived from the five-class label image. The available labelled samples were randomly divided into 70% for training and 30% for validation, ensuring that pixels from the same spatial neighbourhood were not split across both sets to reduce spatial leakage and overly optimistic accuracy estimates. No separate test set was defined from additional scenes; instead, model performance was assessed on the held-out validation subset, complemented by data augmentation (e.g., random rotations, flips and mild spectral perturbations) applied only to the training patches to increase the effective sample size and to improve generalisation under the constraint of sparse ground-truth information.
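One way to keep neighbouring pixels on the same side of a 70/30 split is to assign whole spatial blocks, rather than individual pixels, to each subset. The sketch below illustrates that idea only; the block size, the random seed and the pixel coordinates are hypothetical, and the study does not state which grouping scheme was used.

```python
import random


def block_split(labelled_pixels, block_size=32, train_fraction=0.7, seed=0):
    """Split (row, col) pixels by spatial block so neighbours stay together."""
    blocks = {}
    for row, col in labelled_pixels:
        key = (row // block_size, col // block_size)
        blocks.setdefault(key, []).append((row, col))
    block_ids = sorted(blocks)
    random.Random(seed).shuffle(block_ids)
    # The fraction is applied to blocks, so pixel proportions are approximate.
    n_train = max(1, round(train_fraction * len(block_ids)))
    train = [p for b in block_ids[:n_train] for p in blocks[b]]
    val = [p for b in block_ids[n_train:] for p in blocks[b]]
    return train, val


# Hypothetical labelled-pixel coordinates on a 256 x 256 grid.
pixels = [(r, c) for r in range(0, 256, 16) for c in range(0, 256, 16)]
train_px, val_px = block_split(pixels)
```

Because the split is made at block level, no block contributes pixels to both subsets, which is the property that limits spatial leakage.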
Deep learning models
In this research, two deep learning–based models were employed to delineate alteration units within the Kuhpanj mineralisation zone. The first model is a modified variant of the U-Net architecture. U-Net, originally introduced in 2015 for biomedical image segmentation, is a CNN widely recognised for its effectiveness in semantic segmentation tasks. 37 The architecture consists of an encoder–decoder structure in which the encoder extracts contextual information through successive downsampling operations, while the decoder restores spatial resolution via upsampling. Skip connections link corresponding encoder and decoder layers, facilitating the fusion of low- and high-level features and thereby enabling precise pixel-level segmentation. 38
Due to its flexibility and strong feature-representation capabilities, U-Net has been adopted across diverse domains. In medical imaging, it has been applied to brain tumour segmentation, 39 skin-lesion delineation 40 and lung CT segmentation for COVID-19 detection. 41 In remote sensing, modified U-Net architectures have been used for automatic road extraction 9 and building segmentation. 42 Applications also extend to physics, including the segmentation of Galactic filaments. 43 Numerous variants – such as U-Net++, which integrates multiscale feature aggregation, 44 and TMD–U-Net, which incorporates multi-scale inputs and dense skip connections45,46 – have been proposed to address domain-specific segmentation challenges.
For this study, the U-Net architecture was modified to process the full spectral depth of the hyperspectral cube, allowing the network to analyse spectral variations for each pixel across all 41 PRISMA bands. These modifications enable the extraction of spectral–spatial features relevant to hydrothermal alteration, resulting in a customised architecture developed as follows:
In the U-Net-based model developed for high-dimensional hyperspectral image segmentation, a dual-pathway architecture was designed to capture both the spatial and spectral relationships present in the data.
The spatial pathway begins with a Conv2D layer containing 16 filters and employing a Gaussian Error Linear Unit (GELU) activation function with 'same' padding, so that the convolution preserves the spatial dimensions of its input. L2 regularisation is applied to limit overfitting. Following convolution, a MaxPooling2D operation reduces spatial dimensions by a factor of two, emphasising dominant spatial structures. Batch normalisation is then applied to stabilise and accelerate the learning process, and a dropout layer with a rate of 0.7 further improves generalisation by randomly deactivating feature detectors during training. This sequence – Conv2D, MaxPooling, batch normalisation and dropout – is repeated with increasing filter depths of 32 and 64, allowing the network to progressively extract more abstract spatial representations.
The spectral pathway reshapes the input tensor to prioritise spectral channels and applies a series of Conv1D layers with 16, 32 and 64 filters, each using GELU activation and same padding. Each convolution is followed by MaxPooling1D, batch normalisation and dropout, mirroring the spatial branch but operating along the spectral dimension. A key element of this pathway is the inclusion of a bidirectional GRU layer, which processes the ordered spectral sequence and captures long-range dependencies across wavelengths, thereby enabling contextual spectral interpretation.
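The feature-map sizes implied by the two encoder branches can be traced with simple arithmetic. The sketch below assumes a pooling size of two with no padding in both branches (stated in the text only for the spatial branch) and convolutions that preserve size; the helper name is hypothetical.

```python
def trace_branch(size, filters=(16, 32, 64), pool=2):
    """Track the pooled-axis length and filter count after each encoder stage."""
    stages = []
    for n_filters in filters:
        # A 'same'-padded convolution keeps the size; unpadded pooling floors it.
        size = size // pool
        stages.append((size, n_filters))
    return stages


spatial = trace_branch(256)  # height/width along the spatial (Conv2D) branch
spectral = trace_branch(41)  # band-axis length along the spectral (Conv1D) branch
```

Under these assumptions the spatial axis shrinks from 256 to 32 pixels, whereas the 41-band spectral axis is reduced to only a few positions before the recurrent layer, because an odd length loses one element at each pooling step.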
The two branches are then merged via concatenation, integrating the complementary spatial and spectral feature representations. The combined feature maps are passed through a series of Conv2DTranspose layers forming the decoder portion of the U-Net, where upsampling progressively restores spatial resolution. Skip connections from corresponding levels of the spatial encoder are incorporated to preserve fine-scale spatial details and improve gradient flow during training.
The network concludes with a Conv2D output layer employing a softmax activation function to assign each pixel to one of the predefined alteration classes, producing a segmentation map that distinguishes the various hydrothermal alteration units within the study area. A schematic representation of the dual-input architecture is provided in Figure 2.

A model with two inputs for processing the spatial and spectral parts of the data.
The second model employed for alteration-unit segmentation is a modified version of the V-Net architecture adapted for two-dimensional data. The original V-Net, a 3D fully convolutional architecture, has been widely utilised in medical imaging for segmentation and classification tasks. Its applications include right-ventricle segmentation in gated myocardial perfusion images, 47 multi-organ segmentation in abdominal CT volumes 48 and gross tumour volume delineation in PET imaging for head-and-neck cancer. 49 Several studies have proposed enhanced variants of V-Net, such as the Spatial-Temporal V-Net for automated segmentation 50 and the IE-VNet for inner ear fluid space segmentation.51,52 Comparative analyses have also assessed its performance relative to other architectures; for example, 3D U-Net demonstrated superior segmentation accuracy over V-Net and 3D Res U-Net for intracranial aneurysm segmentation. 53 These studies highlight the importance of anatomical contextual information and the distinctions between slice-based 2D segmentation and full 3D volumetric segmentation.
To adapt V-Net for 2D image data, several architectural modifications have been explored. For instance, VUMix-Net integrates a 3D V-Net with a 2D U-Net to improve gross–tumour–volume segmentation in oesophageal cancer CT data. 54 Additionally, the nnU-Net framework has been used to compare 2D slice-wise and 3D volumetric predictions, demonstrating its ability to handle both modes effectively. 55 Previous work has also introduced modified V-Net architectures specifically optimised for 2D datasets, showing improved segmentation performance across a range of medical imaging tasks. 56 The modified V-shaped model introduced by Dodia et al. 56 has similarly been adopted in this research, and the corresponding architecture is illustrated in Figure 3.

The V-Net model used to separate hydrothermal alteration zones in the two-dimensional PRISMA data with dimensions of 256 × 256 × 41. 56
For the activation functions in this dual-path model, the GELU activation function was used for all layers except the output layer, where a softmax function was required for multi-class classification. Several activation functions were evaluated, but GELU consistently produced superior results. The GELU was introduced by Hendrycks and Gimpel 57 and has gained substantial adoption due to its ability to alleviate vanishing gradient issues, improve training efficiency and enhance overall model accuracy. 58 Its smooth non-linear transformation avoids the abrupt change in slope that ReLU exhibits at the origin, enabling more stable gradient flow. 59 The GELU has demonstrated advantages over ReLU and Sigmoid in tasks such as image classification and target detection.60,61 It has also been integrated into various neural architectures to strengthen generalisation and detection performance. 62
The GELU is defined as follows: 57
$$\mathrm{GELU}(x) = x\,\Phi(x)$$
where x represents the input to the activation function; in deep neural networks, x corresponds to a neuron's pre-activation value, typically produced by a linear transformation of the neuron's inputs h with weights w and bias b:
$$x = \mathbf{w}^{\top}\mathbf{h} + b$$
and Φ(x) is the CDF of a standard normal distribution N(0,1). It gives the probability that a normally distributed random variable Z is less than or equal to x:
$$\Phi(x) = P(Z \le x) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right]$$
This introduces probabilistic smoothness into the activation behaviour of GELU, and erf() represents the normalised integral of the Gaussian distribution. In this context, erf() is used to compute the CDF of the standard normal distribution.
A computationally efficient approximation of the Gaussian CDF is:
$$\Phi(x) \approx \operatorname{sigmoid}(1.702\,x)$$
and sigmoid(t) is:
$$\operatorname{sigmoid}(t) = \frac{1}{1 + e^{-t}}$$
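The erf-based GELU and its sigmoid approximation can be compared numerically with the standard library alone. The sketch below is illustrative; the function names and the evaluation grid are choices made for the example, and the constant 1.702 is the one proposed by Hendrycks and Gimpel.

```python
import math


def gelu_exact(x):
    """GELU(x) = x * Phi(x), with Phi computed from the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def gelu_sigmoid(x):
    """Approximation: Phi(x) is replaced by sigmoid(1.702 * x)."""
    return x * sigmoid(1.702 * x)


# Largest absolute gap between the two forms on a grid over [-5, 5].
grid = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(gelu_exact(x) - gelu_sigmoid(x)) for x in grid)
```

On this grid the two forms stay within a few hundredths of each other, which is why the approximation is often acceptable when evaluating erf is comparatively expensive.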
For both proposed models, the Adam optimizer was employed due to its efficiency and adaptive learning capability in deep neural network training. The initial learning rate was set to 0.01 and progressively reduced using an ExponentialDecay schedule with a decay rate of 0.90, allowing the optimisation process to begin with faster convergence and gradually stabilise during later training stages.
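The effect of an exponential decay schedule on the learning rate can be sketched in a few lines. The initial rate (0.01) and decay rate (0.90) come from the text; the number of steps per decay interval is not reported, so the value of 100 below is a placeholder assumption, as is the optional staircase behaviour.

```python
def exponential_decay(step, initial_lr=0.01, decay_rate=0.90,
                      decay_steps=100, staircase=False):
    """Learning rate after `step` updates: initial_lr * decay_rate ** (step / decay_steps)."""
    exponent = step / decay_steps
    if staircase:
        # Decay in discrete jumps instead of continuously.
        exponent = step // decay_steps
    return initial_lr * decay_rate ** exponent


# Learning rate at a few training steps (decay_steps=100 is hypothetical).
schedule = [exponential_decay(s) for s in (0, 100, 500, 1000)]
```

With these settings the rate falls by 10% over every interval of `decay_steps` updates, so early updates are comparatively large and later ones progressively smaller.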
The training objective was defined as a hybrid loss function composed of three complementary components: Jaccard Loss (IoU), Categorical Cross-Entropy (CCE) Loss and Dice Loss. Each loss function contributes distinct advantages for semantic segmentation tasks. Categorical Cross-Entropy focuses on pixel-wise classification accuracy, while Dice Loss and Jaccard Loss emphasise spatial overlap between predicted and ground truth regions. These overlap-based metrics are particularly effective for handling class imbalance and improving segmentation performance for minority classes.
By integrating these three loss functions, the optimisation process simultaneously promotes accurate pixel classification, improved boundary delineation and robust segmentation performance across heterogeneous classes. Such characteristics are especially important for hyperspectral alteration mapping, where subtle spectral differences and spatial discontinuities must be accurately captured. The combined loss formulation enhances the model's sensitivity to small or spectrally distinct alteration zones while maintaining stable training behaviour.
Various loss functions and combinations were evaluated during model development. However, the weighted combination of Jaccard, CCE and Dice losses yielded the most stable convergence and the best segmentation performance for both the modified V-Net and the dual-path U-Net architecture. The weighted formulation of the combined loss function is expressed as follows:
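$$L_{\text{total}} = \lambda_{J}\,L_{\text{Jaccard}} + \lambda_{C}\,L_{\text{CCE}} + \lambda_{D}\,L_{\text{Dice}}$$
where L_Jaccard, L_CCE and L_Dice denote the Jaccard, Categorical Cross-Entropy and Dice losses, respectively, and λ_J, λ_C and λ_D are the weighting coefficients assigned to each component.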
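A weighted sum of Jaccard, cross-entropy and Dice terms can be sketched in plain Python on a handful of pixels. This is an illustration under stated assumptions, not the study's loss code: the component weights default to one because the actual coefficients are not given here, the soft (probability-based) Dice and Jaccard forms are one common choice, and the class weights and predictions are invented.

```python
import math


def hybrid_loss(y_true, y_pred, class_weights,
                loss_weights=(1.0, 1.0, 1.0), eps=1e-7):
    """Weighted sum of Jaccard, CCE and Dice losses over a flat list of pixels.

    y_true holds one-hot labels, y_pred holds softmax probabilities.
    Pixels whose true class has zero weight are ignored entirely.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred)
             if sum(w * tk for w, tk in zip(class_weights, t)) > 0]
    active = [k for k, w in enumerate(class_weights) if w > 0]

    # Class-weighted categorical cross-entropy.
    cce_sum = weight_sum = 0.0
    for t, p in pairs:
        for k in active:
            if t[k]:
                cce_sum -= class_weights[k] * math.log(max(p[k], eps))
                weight_sum += class_weights[k]
    cce = cce_sum / max(weight_sum, eps)

    # Soft Dice and Jaccard losses, averaged over the active classes.
    dice = jaccard = 0.0
    for k in active:
        inter = sum(t[k] * p[k] for t, p in pairs)
        total = sum(t[k] + p[k] for t, p in pairs)
        dice += 1.0 - (2.0 * inter + eps) / (total + eps)
        jaccard += 1.0 - (inter + eps) / (total - inter + eps)
    dice /= max(len(active), 1)
    jaccard /= max(len(active), 1)

    w_j, w_c, w_d = loss_weights
    return w_j * jaccard + w_c * cce + w_d * dice


# Three pixels, three classes; the last class stands in for "unidentified".
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
good = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
bad = [[0.2, 0.7, 0.1], [0.6, 0.3, 0.1], [0.3, 0.3, 0.4]]
class_weights = [1.0, 2.0, 0.0]  # hypothetical weights
loss_good = hybrid_loss(y_true, good, class_weights)
loss_bad = hybrid_loss(y_true, bad, class_weights)
```

The cross-entropy term penalises each misclassified pixel, while the two overlap terms are computed per class and therefore give small classes the same influence as large ones.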
Results
Characteristics of the models
After defining the architectures used for alteration unit segmentation in the Kuhpanj porphyry copper prospect, both deep learning models were trained using a set of hyperparameters described in the preceding sections. To improve generalisation and prevent overfitting, data augmentation techniques were applied during training while ensuring that the integrity of the segmentation masks was preserved (Figure 1). Given the semantic segmentation nature of the task, only augmentation methods that maintain consistent pixel-to-label correspondence were employed.
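Geometric augmentations preserve pixel-to-label correspondence only if the identical transform is applied to the patch and to its mask. The sketch below shows this with nested lists; the helper names and the tiny 2 × 2 patch are hypothetical, and each patch cell holds a full spectrum that is moved as a unit.

```python
def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def rot90(grid):
    """Rotate a 2D grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment_pair(patch, mask, transform):
    """Apply the same geometric transform to a patch and its label mask."""
    return transform(patch), transform(mask)


# Toy example: each pixel's 3-band spectrum simply repeats its class label,
# which makes the correspondence easy to check after the transform.
mask = [[0, 1], [2, 3]]
patch = [[[float(label)] * 3 for label in row] for row in mask]
rot_patch, rot_mask = augment_pair(patch, mask, rot90)
flip_patch, flip_mask = augment_pair(patch, mask, hflip)
```

Spatial transforms relocate whole spectra without altering band values, so the spectral signature attached to each label is unchanged.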
During model training, the hybrid loss function exhibited stable convergence for both architectures. For the dual-path U-Net model, the training and validation loss values reached minimal and consistent levels after approximately 50 epochs, indicating stable optimisation. For the modified V-Net model, a comparable convergence trend was achieved after approximately 70 epochs. The corresponding loss–curve plots for both models are presented in Figure 4.

Comparative convergence of loss function values for the V-Net (right) and dual-path (left) models over epochs.
Table 2 provides a concise comparison of the architectural complexity and memory requirements of the Dual-Path and V-Net models used in this study. The Dual-Path model demonstrates a highly compact design, comprising 424,837 trainable parameters and occupying only 1.62 MB of storage. In contrast, the modified V-Net architecture exhibits substantially higher computational complexity, with 23,765,093 trainable parameters and a corresponding memory footprint of 90.66 MB. This disparity highlights the computational efficiency of the Dual-Path model, which offers a considerably lower resource demand and thus enables faster training iterations and reduced hardware constraints during deployment.
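The reported storage sizes are consistent with each trainable parameter being stored as a 32-bit float, which can be checked with a short calculation. The 4-byte assumption and the use of binary megabytes (2^20 bytes) are inferences made for this example rather than details stated in the text.

```python
BYTES_PER_FLOAT32 = 4


def size_mb(n_params, bytes_per_param=BYTES_PER_FLOAT32):
    """Storage needed for the parameters, in binary megabytes (2**20 bytes)."""
    return n_params * bytes_per_param / 2**20


dual_path_params = 424_837
vnet_params = 23_765_093

dual_path_mb = round(size_mb(dual_path_params), 2)
vnet_mb = round(size_mb(vnet_params), 2)
ratio = vnet_params / dual_path_params  # how many times larger the V-Net is
```

Under this assumption the V-Net carries roughly 56 times as many parameters as the dual-path model.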
Table 2. Comparison of trainable parameters and model size between the Dual-Path and V-Net models.
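The reported storage figures can be checked with simple arithmetic. The sketch below assumes 4 bytes per parameter (32-bit floats) and 1 MB = 1024² bytes; both are assumptions, although the reported sizes are consistent with them.

```python
def model_size_mb(n_params, bytes_per_param=4):
    """Approximate weight storage in MB, assuming 1 MB = 1024**2 bytes."""
    return n_params * bytes_per_param / 1024**2

dual_path_params = 424_837
v_net_params = 23_765_093

print(round(model_size_mb(dual_path_params), 2))   # 1.62
print(round(model_size_mb(v_net_params), 2))       # 90.66
print(round(v_net_params / dual_path_params, 1))   # 55.9
```

Under these assumptions the modified V-Net carries roughly 56 times as many trainable parameters as the Dual-Path model.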
Regarding hardware configuration, both models were trained using a Google Colab Pro environment equipped with an NVIDIA T4 GPU and high-memory runtime, ensuring stable execution and accommodating the increased memory requirements of the V-Net model.
Evaluation of the models
For the validation and comparative assessment of the classification outputs generated by the V-Net and Dual-Path models, ground truth samples from previous studies within the Kuhpanj prospect were used. In total, 32 field samples were compiled from these earlier investigations.63–65 They comprise 10 propylitic, 16 argillic and 6 phyllic alteration specimens. Their spatial distribution is illustrated in Figure 5, superimposed on a false-colour composite image of the region constructed from PRISMA SWIR bands 50–100–165.

Figure 5. Locations of the sample points on the false-colour composite image (short-wavelength infrared [SWIR] bands 50–100–165) of the Kuhpanj area.
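A false-colour composite of this kind can be sketched as follows. The function name, the (H, W, B) cube layout, the 2–98% percentile stretch, the assignment of the three bands to red, green and blue in the order listed, and the interpretation of the band numbers as 1-based indices into the SWIR cube are all assumptions made for illustration.

```python
import numpy as np

def false_colour_composite(cube, band_indices, low=2.0, high=98.0):
    """Stack three bands of an (H, W, B) cube into RGB with a per-channel percentile stretch."""
    rgb = np.stack([cube[..., b] for b in band_indices], axis=-1).astype(np.float64)
    for c in range(3):
        lo, hi = np.percentile(rgb[..., c], [low, high])
        # Rescale to [0, 1]; guard against a constant band (hi == lo)
        rgb[..., c] = np.clip((rgb[..., c] - lo) / max(hi - lo, 1e-12), 0.0, 1.0)
    return rgb

# Usage with a synthetic cube; bands 50, 100 and 165 become zero-based indices 49, 99 and 164
swir_cube = np.random.default_rng(0).random((64, 64, 173))
composite = false_colour_composite(swir_cube, (49, 99, 164))
```

The percentile stretch is used here only to make the three channels comparable for display; it does not alter the underlying reflectance data.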
Additionally, petrographic examinations were performed on selected thin sections (Figure 6), confirming the presence of epidote and chlorite in propylitic alteration zones and sericite (muscovite + illite) as characteristic minerals within phyllic alteration zones. A detailed summary of these mineralogical observations is provided in Table 3.

Figure 6. Photomicrographs of four sample points: (a) sample of propylitic alteration with the presence of epidote; (b) andesite sample with phyllic alteration and presence of sericite and pyrite veinlets; (c) granodiorite sample with phyllic alteration and presence of crossed nickel veinlets; (d) andesite sample with propylitic alteration and presence of epidote and chlorite.63,65
Table 3. Confusion matrix for the Dual-Path model.
The bold values indicate the best-performing values and the main summary metrics reported for each model, specifically the overall performance measures.
To evaluate and compare the performance of both models in distinguishing background or non-altered areas, an additional set of 13 reference points was selected from regions categorised as unaltered. These points were identified using the 1:5000 geological map of the Kuhpanj area, which provides detailed spatial information on zones lacking hydrothermal alteration. Since these locations fall outside the alteration footprints, they are not displayed in Figures 1 or 5, ensuring that the comparison focuses specifically on non-altered ground conditions.
The output of the Dual-Path model for the Kuhpanj mineralisation area, generated at a spatial dimension of 256 × 256 pixels, is presented in Figure 7, and that of the V-Net model in Figure 8. The predictions of both models were assessed against the previously described ground truth samples using a confusion-matrix-based evaluation (Tables 3 and 4). For the Dual-Path model, this analysis yielded a weighted F1-score of 86% (Table 3), indicating strong overall segmentation performance across the alteration classes.

Figure 7. Alteration map predicted by the Dual-Path model. Pink indicates the distribution of phyllic alteration, yellow argillic alteration and green propylitic alteration (colour online).

Figure 8. Alteration map predicted by the modified V-Net model. Pink indicates the distribution of phyllic alteration, yellow argillic alteration and green propylitic alteration (colour online).
Table 4. Confusion matrix for the modified V-Net model.
The bold values indicate the best-performing values and the main summary metrics reported for each model, specifically the overall performance measures.
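For readers unfamiliar with the metric, the following sketch shows how a support-weighted F1-score is derived from a confusion matrix. The function name and the small example matrix are hypothetical and are not the values reported in Tables 3 and 4; rows are assumed to hold reference classes and columns predicted classes.

```python
import numpy as np

def weighted_f1(cm):
    """Support-weighted F1 from a confusion matrix (rows = reference, columns = predicted)."""
    cm = np.asarray(cm, dtype=np.float64)
    tp = np.diag(cm)                 # correctly classified samples per class
    support = cm.sum(axis=1)         # reference samples per class
    predicted = cm.sum(axis=0)       # samples assigned to each class

    # Safe divisions: classes with no predictions or no samples get a score of 0
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)

    # Weight each class F1 by its share of the reference samples
    return float(np.sum(f1 * support) / support.sum()), f1

# Hypothetical two-class example (not data from this study)
score, per_class = weighted_f1([[1, 1],
                                [0, 2]])
```

Weighting by support means that classes with more reference samples, such as the 16 argillic specimens here, contribute more to the overall score than sparsely sampled classes.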
Discussion
Based on the Weighted F1 scores obtained for both models, their overall performance can be considered largely comparable, with both architectures exhibiting nearly identical results in the classification of phyllic and argillic alteration units. The primary performance differences arise in the detection of propylitic alteration and background (non-altered) areas, where both models demonstrate lower accuracy relative to the other two alteration classes. This reduced performance may be associated with factors such as the limited spatial extent of the study area, the restricted number of available training samples and the heterogeneous distribution of diagnostic minerals characterising propylitic alteration.
Of the two models, the modified V-Net exhibits slightly weaker detection capability for these categories, achieving a weighted F1-score of approximately 81%, compared with 86% for the Dual-Path model. It is noteworthy that this difference occurs despite the substantial disparity in architectural complexity: the Dual-Path model contains only 424,837 trainable parameters (1.62 MB), whereas the modified V-Net comprises 23,765,093 parameters (90.66 MB). These results indicate that compact deep learning architectures can effectively capture pixel-level spectral–spatial signatures in hyperspectral data and establish meaningful correlations for mineral exploration tasks, even with very small training datasets.
One of the central objectives of this research was to generate a reliable alteration map of the study area using deep learning methods and a limited number of training samples. The performance achieved by both models – particularly the Dual-Path architecture – demonstrates the feasibility of this approach.
This study was conducted on a single hyperspectral PRISMA scene with dimensions of 256 × 256 pixels and 41 selected bands over the Kuhpanj porphyry copper prospect. Consequently, the current experimental evidence is restricted to a relatively small image patch and a specific geological setting. The performance of the proposed dual-path U-Net may therefore vary for larger scenes, for images acquired under different illumination and atmospheric conditions or for deposits hosted in different lithological and structural frameworks. In such cases, the spatial context, background variability and class boundaries can become more complex, potentially affecting segmentation accuracy and class balance.
Nevertheless, the design of the method is not tied to Kuhpanj itself, but rather to the generic problem of spectral–spatial semantic segmentation of hyperspectral imagery. The model directly processes multispectral representations derived from hyperspectral sensors without requiring dimensionality reduction techniques such as PCA or MNF, or hand-crafted false colour composites, and it has been adapted from a medical image architecture to explicitly learn both spectral and spatial features. Since PRISMA data have already been used in a wide range of applications (e.g., vegetation, water quality, atmospheric and emission monitoring, soil and ice characterisation), the same modelling strategy can, in principle, be transferred to other regions and exploration targets, provided that suitable training labels are available and that the network is re-trained or fine-tuned on the new data.
From a practical exploration perspective, the limited amount of labelled data available at Kuhpanj realistically reflects common constraints in early-stage mineral projects. The fact that the modified U-Net attains a weighted F1-score of 86% under such sparse supervision suggests that the approach is robust enough to be considered for other under-sampled deposits. However, we emphasise that further validation on larger PRISMA scenes and on additional porphyry and non-porphyry systems is required before drawing definitive conclusions about its generalisation capability. Future work should therefore include (i) testing on larger, more heterogeneous hyperspectral scenes, (ii) evaluating the model on other sensor platforms (e.g., high-spatial resolution multispectral sensors such as WorldView-3) and (iii) comparing the present architecture with more recent transformer-based models tailored for spectral–spatial modelling.
Conclusions
The present study demonstrates the potential of deep learning–based semantic segmentation for hyperspectral alteration mapping in porphyry copper exploration. The principal findings and contributions are summarised as follows:
A modified dual-path U-Net architecture was developed to simultaneously capture spectral and spatial information from PRISMA hyperspectral imagery, enabling direct processing of 41 spectral bands without applying dimensionality reduction techniques such as principal component analysis or false colour composites.
The proposed framework successfully delineated hydrothermal alteration zones in the Kuhpanj porphyry copper prospect, achieving a weighted F1-score of 86%, confirming the effectiveness of the model for mineral exploration applications.
The results demonstrate that reliable hyperspectral semantic segmentation can be achieved even with a highly limited set of labelled training samples, highlighting the applicability of deep learning methods in data-constrained geological environments.
Compared with the redesigned V-Net benchmark, the proposed model achieves competitive segmentation performance while requiring substantially fewer trainable parameters, indicating improved computational efficiency.
Overall, the proposed approach provides an efficient and accurate deep learning framework for hyperspectral alteration mapping and contributes to advancing the integration of artificial intelligence methods into mineral exploration workflows.
Supplemental Material
Supplemental material, sj-jpg-1-ape-10.1177_25726838261460116 for Semantic segmentation of hyperspectral data for identifying mineral alterations in the Kuhpanj porphyry copper deposit using an improved U-Net architecture by Puyan Javandel, Nader Fathianpour and Seyed Hassan Tabatabaei in Applied Earth Science
Footnotes
Acknowledgement
The authors gratefully acknowledge the Italian Space Agency (Agenzia Spaziale Italiana, ASI) for granting access to the PRISMA hyperspectral data used in this study. The availability of these data substantially supported the analysis of mineral alteration patterns in the Kuhpanj porphyry copper deposit.
Funding
The authors received no financial support for the research, authorship and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References
