Abstract
Urban regeneration intensifies long-standing challenges of spatial fragmentation, declining visual diversity, and weakened local identity driven by rapid urbanization globally. Urban visual features, as the most immediate layer of urban form, strongly shape perception and behavior. Yet, the distinct roles of urban architectural color and style remain poorly specified. This study addresses the central question of which visual attribute, color or style, exerts stronger effects by developing an integrated framework that combines visual computing, spatial quantification, and perception modeling. Meizhou Island in southeast China provides the empirical setting. Color attributes (complexity, harmony, saturation, and value) were extracted from street-view images, while architectural styles were identified through a Convolutional Neural Network (CNN) model. Ordinary Least Squares (OLS) regression and eXtreme Gradient Boosting (XGBoost) were employed to test associations and interaction effects. Findings show that color features, particularly complexity, have higher predictive power for both perception and behavior than architectural style. However, high color complexity and saturation, while enhancing visual appeal, reduce behavioral engagement measured by visit frequency from geo-tagged user data, exposing a perception–behavior gap. Architectural style exerts weaker effects and functions primarily as a symbolic cue rather than a behavioral driver. Moreover, urban morphological characteristics, especially functional diversity, significantly moderate how visual attributes influence outcomes across different levels of complexity. The findings highlight the need to integrate color strategies into regeneration, and to align visual expression with spatial function to foster perception- and behavior-driven urban interventions.
Introduction
As global urbanization shifts into an era of stock-based development, cities increasingly confront challenges of spatial incoherence, visual homogenization, and the erosion of local identity (Cordeiro et al., 2024; Lemoine-Rodríguez et al., 2020; Senderos et al., 2025). Against this backdrop, visual environmental quality has become central to assessing urban quality of life (QoL), particularly within the context of urban regeneration and spatial governance across countries (Lin et al., 2025; Ye et al., 2019). International agendas reinforce this point; UN-Habitat identifies cultural legibility and visual quality as levers for “cities for all” (UN-Habitat, 2020). Similarly, the European Union’s Urban Agenda stresses the role of heritage and design in neighborhood regeneration (European Commission, 2019). The appearance of urban visual features shapes well-being, anchors place attachment, and mediates how people engage with the city. Urban architectural color and style, in particular, operate as visible cues that trigger impressions and affective responses. As urban data proliferates, visual computing now enables urban aesthetics assessment, linking the physical environment and people’s perceptions and spatial behaviors (Sousa et al., 2023; Xue et al., 2025). This shift makes it urgent to examine how color and style drive visitors’ perception and behavior—a task crucial for advancing human-centered regeneration strategies.
Currently, research on urban architectural color and style has moved beyond aesthetics and regulation toward perception–behavior paradigms that stress measurement, comparison, and generalization. Early work rooted in environmental psychology established color as a behavioral stimulus (Bacon, 1976), most notably through the stimulus–organism–response (SOR) model, which links environmental cues to affective states and behavioral reactions (Mehrabian and Russell, 1974). Subsequent studies emphasized the symbolic and emotional roles of color, connecting it to identity, atmosphere, and emotional regulation (De Mattiello and Rabuini, 2011; Tosca, 2002). The digital turn has accelerated this shift. Advances in image processing now enable precise extraction of dominant urban visual features (Zhou et al., 2023), allowing color distributions and spatial structures to be quantified through clustering and chromatic typologies (Nguyen et al., 2020; Xue et al., 2025). Architectural style research has followed a similar path, evolving from manual classification to algorithm-driven recognition using computer vision (Hidayetoğlu et al., 2010; Kang et al., 2018; Krier, 1988). Urban form provides the spatial context in which these visual cues operate. Research consistently shows that enclosure, connectivity, density, and continuity shape perceptions of safety, comfort, and legibility, while block- and façade-scale morphology influences aesthetic preference, pedestrian activity, and place perception (Angel et al., 2024; Ewing and Handy, 2009). Urban form is thus not a passive backdrop but an active mediator that interacts with color and style to structure public perception and spatial behavior (PPSB) (Lu et al., 2025; Zhao et al., 2025). Yet much of the literature still treats color and style in isolation, leaving their relative and joint influence underexplored (Chen et al., 2025; Ng, 2020). 
At the street level, color and style both function as salient and immediately legible visual cues (Cao et al., 2025; Wang et al., 2025). In practice, they are central tools in urban regeneration, where recoloring, style restoration, and façade renewal are routinely used to reshape urban identity (Fan et al., 2023). Therefore, examining how color and style jointly shape PPSB advances both theory and design practice.
The measurement of PPSB has likewise shifted from conventional survey-based approaches to data-intensive, algorithm-driven ones. Earlier survey-based questionnaires struggled with small samples, recall bias, and time lags, limiting their ability to capture diverse responses in dynamic urban settings (Ewing and Handy, 2009). The spread of geo-tagged social media has changed this terrain. Researchers now mine text, images, and activity traces to build large-scale, real-time models of perception and behavior (H. Song et al., 2025; Chen et al., 2026). Such data carry spatial anchors, temporal continuity, and individual heterogeneity, enabling the capture of lived experience at scale. In perceptual modeling, natural language processing (NLP), sentiment analysis, and topic modeling extract multi-dimensional indicators, ranging from emotional polarity to place attachment, from user-generated content (UGC) (J. Chen et al., 2025; Q. Song et al., 2025). On the behavioral side, geo-tagged check-in data serve as proxies for actual spatial practices (Zhang et al., 2020). Techniques such as kernel density estimation (KDE), spatial autocorrelation, and hotspot clustering reveal high- and low-activity zones, identify voids, and map patterns of vitality (Orellana and Guerrero, 2019; Yang et al., 2024). These methods help expose how online social media and offline physical visual features co-produce PPSB.
Despite progress, two limitations remain. First, most studies adopt either subjective perceptual evaluation or objective image-based analysis, while integrated approaches that combine both dimensions remain relatively underexplored. This divide often yields discrepancies, as computational metrics of visual quality frequently diverge from lived experience. Second, the relative influence of color and style has seldom been examined in parallel (Munteanu, 2023; N. Chen et al., 2025). The spatial patterns, perceptual pathways, and behavioral consequences remain poorly understood. Comparative analyses across dimensions are rare, and mechanism-based inquiries rarer still (He et al., 2025). To bridge these gaps, this study takes Meizhou Island, Fujian, as a case. It proposes an integrated framework that merges street-view imagery, deep visual computing, urban form metrics, and social media mining. Color features (complexity, harmony, saturation, and value) are extracted from images, while architectural styles are classified via convolutional neural networks (CNN). These visual attributes are linked with urban form indicators, geo-tagged texts of public sentiment, and spatial behavior from check-in records. Ordinary least squares (OLS) and eXtreme Gradient Boosting (XGBoost) models are then applied to test marginal effects and interaction mechanisms. On this basis, the study addresses three questions: (1) To what extent do urban architectural color and style influence PPSB? (2) What interaction effects among visual and morphological features shape PPSB?
At the theoretical level, this study advances an integrated framework that combines visual computing, spatial econometrics, and perception–behavior modeling. The framework offers a cross-scale and cross-dimensional pathway to examine the causal links between urban visual attributes and public responses, thereby extending the theoretical boundaries of visual urban studies. At the practical level, the findings provide quantitative support for visual governance in urban regeneration, the design of public spaces, and the creation of perception-friendly cities. They also inform perception-oriented design interventions and spatial strategies that strengthen the positive influence of the built environment on public psychology and behavior.
Literature review
Research methods and trends in urban color and style analysis
As core visual elements of urban space, architectural color and style are central to how people form spatial cognition, emotional responses, and aesthetic judgements in everyday urban life. Color shapes sensory and affective states—thermal comfort, order, restorativeness—and evokes emotions ranging from familiarity to estrangement (Lee et al., 2025; Zhou et al., 2023). Style, by contrast, operates through symbolic codes: it signals history, exoticism, or locality, and thereby calibrates judgments of urban quality (Xu et al., 2023).
Research practice has shifted markedly over time. Early work depended on human judgment: histograms in RGB space for color, typological readings of ornament and proportion for style (Krier, 1988). With the rise of computer vision, color analysis moved toward Lab-based extraction, harmony detection, and clustering of saturation values, producing higher-resolution descriptors (Nguyen et al., 2020; Zhou et al., 2023). Style recognition followed a similar leap; CNN and transfer learning models, such as the Visual Geometry Group network (VGG) and the Residual Network (ResNet), have been widely applied to automate façade classification, enabling city-scale mapping and style identification (Xu et al., 2023; Yang et al., 2022).
Data infrastructures have also changed the field. Google Street View and similar platforms now supply massive, standardized, and frequently updated imagery. Combined with segmentation and object detection, they enable the extraction of building outlines, imagery cleaning, and visible-area calibration (Wu et al., 2023; Zhou et al., 2023). This technological pathway enables large-scale, objective, and scale-consistent extraction of architectural visual variables, providing a robust basis for subsequent perceptual modeling and spatial analysis.
Yet despite advances in the automatic recognition of building color and style, most existing studies remain focused on the isolated extraction and descriptive display of visual variables. Integrated modeling that links these variables to public perception and behavioral responses is still at an exploratory stage. Some attempts have been made to combine convolutional image outputs with cognitive survey data in regression models (Q. Song et al., 2025; Yang et al., 2024). However, systematic comparisons of the explanatory power of different visual variables are still missing, and mechanism-level analyses remain underdeveloped (Kasraian et al., 2021; Sousa et al., 2023).
Advances in measuring PPSB
The measurement of PPSB is being reshaped by artificial intelligence and urban computing. What once depended on questionnaires and interviews, constrained by small samples and recall bias (Ewing and Handy, 2009), now draws on large-scale digital traces—from social media posts and street-view imagery to mobility trajectories—that capture urban experience in real time (Ito et al., 2024; Ni et al., 2025; Wu et al., 2023). This shift improves timeliness and scale, while also pushing methods from static description toward semantic modeling, causal inference, and interaction detection.
On the perceptual side, research has moved beyond extracting simple sentiment polarity or binary positive–negative scores. New approaches now incorporate topic modeling, cognitive structure construction, and spatial heterogeneity detection to unpack cultural affiliation, attribution patterns, and subjective spatial boundaries embedded in public texts (Yang et al., 2024, 2025). Guo and Yang (2025) combined Linear Discriminant Analysis (LDA) with vector representations to map nonlinear links between urban elements and public expressions across scales. Wu et al. (2023) embedded text analysis within street-view contexts, grounding subjective narratives in visible environments. Together, these approaches start to resolve semantic ambiguity and improve comparability in perception measurement.
Behavioral analysis has advanced in parallel. Instead of treating check-ins as static points, scholars now model trajectories, dwell times, and behavioral sequences. The focus has shifted to “path–destination–perception” linkages. Orellana and Guerrero (2019) developed a network-based approach to infer cycling behavior, incorporating structural attributes of transport paths into probability modeling. In a similar vein, Yang (2024) examined walking behaviors in commercial districts and identified response pathways to environmental perception, revealing temporal lags and moderating variables between space and behavioral intentions. Yet despite these advances, check-ins remain a widely used proxy for spatial engagement—particularly in tourism-oriented or high-density urban contexts, where continuous tracking is impractical and platform data provide broad coverage (Huang, 2022; Zhang et al., 2020). This dual trajectory highlights both the sophistication of emerging methods and the enduring value of check-in data in mapping everyday urban behavior.
Moreover, multimodal fusion has become a major trend in recent years. A hybrid approach integrating street-view imagery with 3D simulation data has been developed to construct perception maps at the meso-spatial scale (Yosifof and Fisher-Gewirtzman, 2024). Zhang (2020) combined imagery and user-generated content to detect “visually inconspicuous areas,” addressing blind spots missed by check-in data alone. These studies demonstrate a broader movement: toward heterogeneous data integration, parallel modeling strategies, and sharper mechanism identification in perception–behavior research.
Research framework
This study proposes an integrated framework to examine how urban color and architectural style influence PPSB (Figure 1). Color is represented through façade-level indicators extracted from street-view imagery. Style is identified using deep learning–based classification of building façades. Urban form is measured with indicators of density, diversity, and design (Cervero, 1997). Public perception is proxied by sentiment analysis of geo-tagged texts, while public behavior is captured through check-in density mapping. Analytical strategies combine regression models and machine learning to evaluate linear and nonlinear effects, with explainable AI methods used to interpret variable importance and interaction mechanisms. The framework aims to assess the distinct and significant effects of color and style on PPSB, comparing their relative explanatory power to identify potential mismatches. It also seeks to provide empirical evidence that can inform the integration of visual strategies into perception-driven and behavior-sensitive urban regeneration, ultimately contributing to more effective and targeted urban renewal practices.
Figure 1. Research framework.
Data and methods
Study area
Meizhou Island, located off the coast of Fujian Province in southeast China, provides the empirical ground for this study (Figure 2). The island covers 14.35 km², faces the Taiwan Strait, and is characterized by compact settlements and a dense street network. Culturally, Meizhou Island is the birthplace of Mazu culture and retains representative examples of Minnan-style architecture, noted for their iconic roof structures, decorative vocabularies, and vivid chromatic expression. Meizhou Island’s architectural landscape blends traditional Minnan, modernist, and pragmatic styles, where cultural symbolism intersects with contemporary practices, offering a rich context to compare perception differences across color and style. Functionally, Meizhou Island serves a dual role. It is both a sacred pilgrimage destination and a tourism-oriented coastal city. This duality attracts steady flows of visitors, while also generating abundant user-generated content on social media platforms. These combined attributes (cultural significance, architectural diversity, and rich digital traces) make Meizhou Island a representative case for examining how color and style shape PPSB.
Figure 2. Research area: Meizhou Island, Fujian Province. Data sources: (a–b) https://bzdt.ch.mnr.gov.cn/index.html; (c) https://www.openstreetmap.ch; (d) https://image.baidu.com/.
Research methods
Spatial unit construction and data integration
To integrate visual, perceptual, behavioral, and morphological data, all datasets were harmonized within a common spatial framework. The study area was divided into 50 m × 50 m grid cells, selected to capture street-level variation while preserving data density in Meizhou Island’s compact built environment. Street-view images were georeferenced along survey routes and assigned to grid cells; visual indicators were aggregated using mean values and spatial smoothing. Geo-tagged social media metrics and OpenStreetMap-derived urban form indicators were likewise matched and summarized at the grid level. Although visual features were extracted from 343 images, all indicators were spatially aggregated using kernel density estimation, yielding a consistent grid-cell dataset for analysis. All statistical and machine-learning models were therefore estimated at the grid level (n ≈ 1,310).
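The grid-level harmonization described above can be illustrated with a minimal sketch. It assumes point observations in projected coordinates (metres) and uses a plain per-cell mean; the function name `aggregate_to_grid` and the sample values are hypothetical, and the kernel-density smoothing step mentioned in the text is not reproduced here.

```python
from collections import defaultdict
from math import floor

CELL_SIZE_M = 50.0  # grid resolution stated in the text


def aggregate_to_grid(points, cell_size=CELL_SIZE_M):
    """Average point-level indicator values within square grid cells.

    points: iterable of (x, y, value), with x and y in projected metres
    (an assumption; the source does not state the coordinate system).
    Returns a dict mapping (col, row) cell indices to the mean value.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        cell = (floor(x / cell_size), floor(y / cell_size))
        sums[cell] += value
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Hypothetical image-level color-complexity scores at three locations.
sample = [(10.0, 20.0, 0.4), (30.0, 45.0, 0.6), (120.0, 20.0, 0.9)]
grid_means = aggregate_to_grid(sample)
```

The first two points fall in the same 50 m cell and are averaged, while the third occupies its own cell. The same keying scheme could be reused to join perception, behavior, and urban form indicators onto a shared cell index.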
Urban architectural color indicators
Prior to color extraction, all street-view images underwent standardized preprocessing to reduce lighting-related noise and ensure robust color measurement. Images were collected along a predefined route across Meizhou Island’s built-up areas between June 10 and 17, 2024, consistently captured from 9:00 to 11:00 a.m. to maintain stable solar angles. A GoPro camera with fixed automatic exposure and white balance settings ensured acquisition consistency.
To address potential variations due to solar angle, shadows, and local lighting conditions, we applied a multi-step preprocessing procedure adapted from Xue et al. (2025). This included automatic white-balance correction based on Retinex theory and the gray-world assumption, followed by semantic segmentation of building facades using DeepLabv3+ with a ResNeSt-101 backbone (IoU = 0.85; F1 = 0.91). Non-target elements (sky, vegetation, pedestrians) were removed, and HSV pixels with brightness below 20 were excluded to limit shadow effects. Together, these steps minimize illumination bias and provide a stable basis for color-feature extraction.
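Two of the preprocessing steps named above, gray-world white balancing and the exclusion of dark pixels, can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' pipeline: the Retinex component and the DeepLabv3+ segmentation are omitted, pixels are plain RGB tuples on a 0–255 scale, and the brightness threshold of 20 is assumed to refer to a 0–255 HSV value channel.

```python
import colorsys


def gray_world_balance(pixels):
    """Scale each RGB channel so that all three channel means become equal."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0
    gains = [gray / m if m > 0 else 1.0 for m in means]
    # Clip to the valid 0-255 range after applying the per-channel gain.
    return [tuple(min(255.0, p[c] * gains[c]) for c in range(3)) for p in pixels]


def drop_dark_pixels(pixels, min_value=20):
    """Keep pixels whose HSV value is at least min_value (0-255 scale assumed)."""
    kept = []
    for r, g, b in pixels:
        _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if v * 255.0 >= min_value:
            kept.append((r, g, b))
    return kept


# Hypothetical facade pixels with a reddish cast and one deep-shadow pixel.
facade = [(200, 100, 100), (100, 50, 50), (10, 5, 5)]
balanced = gray_world_balance(facade)
usable = drop_dark_pixels(balanced)
```

In this toy input the red cast is neutralized and the darkest pixel is discarded before any color statistics are computed.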
Urban architectural color features were extracted through a three-step workflow: preprocessing, façade segmentation, and clustering (Figure 3). White-balance correction reduced illumination bias, segmentation removed non-target elements, and K-means clustering condensed pixels into dominant HSV-based color palettes. Pixels were assigned to dominant hues, reducing redundancy while preserving perceptual relevance. We tested k = 5, 10, and 20, and identified k = 10 as optimal based on clustering performance and elbow-method validation.
Figure 3. Workflow for extracting and quantifying dominant façade colors. (a) Color extraction process; (b) Minnan style street example; (c) Neo-European style street example.
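The clustering step can be illustrated with a small, dependency-free version of Lloyd's K-means. It is a sketch only: `dominant_palette` is a hypothetical helper, the pixels are toy HSV triples scaled to 0–1, k is set to 2 rather than the study's k = 10, and hue is treated as a linear quantity even though it is circular.

```python
import random
from collections import Counter


def dominant_palette(pixels, k, iters=20, seed=0):
    """Cluster HSV pixels with Lloyd's K-means and return (center, share) pairs.

    Hue is handled as a linear coordinate here, which ignores its circular
    nature; this simplification is an assumption of the sketch.
    """
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in pixels
        ]
        # Update step: move each center to the mean of its member pixels.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(ch) / len(members) for ch in zip(*members))
    shares = Counter(labels)
    palette = [(centers[j], shares[j] / len(pixels)) for j in range(k)]
    return sorted(palette, key=lambda item: item[1], reverse=True)


pixels_hsv = [
    (0.02, 0.80, 0.90), (0.03, 0.82, 0.88), (0.01, 0.78, 0.92),  # warm, vivid
    (0.60, 0.20, 0.50), (0.62, 0.22, 0.48), (0.58, 0.18, 0.52),  # cool, muted
]
palette = dominant_palette(pixels_hsv, k=2)
```

Each palette entry pairs a cluster center with the fraction of pixels it represents, which is the kind of summary from which façade-level indicators can then be derived.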
From the clustered outputs, four color indicators were derived following Xue et al. (2025). Saturation and value capture chromatic intensity and luminance. Hue was decomposed into color complexity, reflecting compositional richness, and color harmony, measuring coherence among hues. Harmony was quantified using ΔE distances in CIELAB space, a standard metric for perceptual color similarity. Together, these measures translate raw color information into perceptually and computationally meaningful descriptors of urban architectural color. The formulas are listed below.
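To make the ΔE idea concrete, the sketch below computes the mean pairwise CIE76 color difference over a palette already expressed in CIELAB, alongside a Shannon-entropy measure of palette shares. Both are illustrative stand-ins and not the formulas of Xue et al. (2025): the use of CIE76 (rather than a later ΔE variant), the entropy proxy for complexity, and the sample palette are all assumptions.

```python
from itertools import combinations
from math import dist, log


def mean_pairwise_delta_e(lab_colors):
    """Mean CIE76 difference (Euclidean distance in CIELAB) over all color pairs."""
    pairs = list(combinations(lab_colors, 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def palette_entropy(shares):
    """Shannon entropy (in nats) of palette shares; a hypothetical complexity proxy."""
    return -sum(s * log(s) for s in shares if s > 0)


# Hypothetical three-color palette as (L*, a*, b*) triples with pixel shares.
palette_lab = [(50.0, 10.0, 10.0), (50.0, 13.0, 14.0), (60.0, 10.0, 10.0)]
spread = mean_pairwise_delta_e(palette_lab)
richness = palette_entropy([0.5, 0.25, 0.25])
```

A smaller mean ΔE indicates palette colors that are perceptually closer to one another, while higher entropy indicates pixels spread more evenly across palette entries.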
Architectural style classification via deep learning
In this study, architectural style is defined as street-scale façade typologies distinguished by salient visual cues. These cues derive from building form, façade composition, and decorative expression, including rooflines, symmetry, window-to-wall ratios, materials and textures, and the presence and complexity of ornamentation (Chesné and Ioannidis, 2024; Nasar, 1994). Based on field surveys, local heritage controls, and design guidelines—supported by expert visual interpretation—we identified five representative styles on Meizhou Island: Minnan traditional, modernist, pragmatic Chinese residential, Neo-European residential, and functionalist architecture (see Table S1). All façade images were independently annotated by two architectural researchers, yielding high inter-rater agreement (Cohen’s κ = 0.76), which confirms classification reliability. The sample comprised 31 Minnan, 8 modernist, 35 pragmatic, 24 Neo-European, and 4 functionalist façades.
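Cohen's κ, the agreement statistic reported above, compares observed agreement with the agreement expected by chance. The sketch below shows the calculation on a toy set of six labels; the annotations are hypothetical and do not reproduce the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Undefined (division by zero) if chance agreement equals 1, for example
    when both raters assign a single identical label to every item.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal label proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical annotations for six facades.
rater_1 = ["Minnan", "Minnan", "pragmatic", "pragmatic", "modernist", "Minnan"]
rater_2 = ["Minnan", "pragmatic", "pragmatic", "pragmatic", "modernist", "Minnan"]
kappa = cohens_kappa(rater_1, rater_2)
```

Here the raters agree on five of six façades, yet κ is lower than the raw agreement rate because part of that agreement would be expected by chance.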
Style identification followed a deep-learning workflow (Figure 4). Semantic segmentation first isolated building façades from street-view imagery, excluding non-architectural elements. The resulting images were then classified using a VGG-16 convolutional neural network fine-tuned via transfer learning. This architecture effectively captures morphological features such as roof forms, façade articulation, and ornamental patterns, while transfer learning enhances performance under limited labeled data. Model accuracy was evaluated using a validation set; confusion matrices and precision, recall, and F1 scores are reported in Table S2. Although class imbalance affects some metrics, the integrated segmentation–CNN approach offers a scalable and objective means of distinguishing architectural styles across heterogeneous urban settings (Xu et al., 2023).
Figure 4. VGG-16 convolutional neural network for architectural style classification.
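The per-class metrics mentioned above follow directly from a confusion matrix. The sketch below uses a hypothetical three-class matrix (not the values in Table S2) to show how a rare class can post weak recall even when the larger classes are classified well.

```python
def per_class_metrics(confusion, labels):
    """Return {label: (precision, recall, f1)} from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    metrics = {}
    for i, label in enumerate(labels):
        tp = confusion[i][i]
        fn = sum(confusion[i]) - tp
        fp = sum(row[i] for row in confusion) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Hypothetical validation counts; the last class has only two samples.
labels = ["Minnan", "pragmatic", "functionalist"]
confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 1, 1],
]
scores = per_class_metrics(confusion, labels)
```

With only two functionalist samples in this toy matrix, a single misclassification halves that class's recall, which is the kind of sensitivity that class imbalance introduces.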
Urban form indicators
Urban form indicators based on the 3D framework.
SnowNLP Chinese natural language processing
We analyzed sentiment in geotagged social media texts using SnowNLP, an NLP toolkit designed for Chinese text (Figure 5). SnowNLP leverages pre-trained sentiment dictionaries tailored to online discourse in China, enabling accurate classification of public emotions in localized urban contexts. To improve contextual sensitivity, the standard sentiment lexicon was augmented with domain-specific terms frequently associated with urban spaces and everyday expressions. This enhancement improved the model’s ability to capture nuanced semantics and reflect the affective dimensions of public perception.
Figure 5. Schematic diagram of the sentiment analysis using SnowNLP.
OLS regression
The OLS model provides a robust framework for estimating the linear relationship between multiple predictors and a single outcome. In this study, we used OLS to assess how urban color and architectural style influence PPSB. The model takes the following general form:
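A generic multiple-regression specification consistent with this description is sketched here. The symbols are assumptions chosen for illustration and may differ from the authors' notation.

```latex
Y_i = \beta_0 + \sum_{k=1}^{K} \beta_k X_{ik} + \varepsilon_i
```

In this sketch, $Y_i$ is the perception or behavior score of grid cell $i$, $X_{ik}$ is the $k$-th predictor (color, style, or urban form), $\beta_0$ is the intercept, $\beta_k$ are the coefficients to be estimated, and $\varepsilon_i$ is the error term.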
XGBoost
To further investigate the relationship between urban color, architectural style, and PPSB, we employed the XGBoost algorithm for predictive modeling. XGBoost is a scalable, high-performance implementation of gradient boosting, well-suited for capturing complex nonlinear relationships in large datasets.
Formula (1) formalizes the objective function, where
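For reference, the regularized objective conventionally given for XGBoost is reproduced here. It is assumed, not confirmed, to correspond to Formula (1), and the notation is the conventional one rather than necessarily the authors'.

```latex
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \tfrac{1}{2}\,\lambda \lVert w \rVert^{2}
```

In this standard form, $l$ is a differentiable loss comparing the observed value $y_i$ with the prediction $\hat{y}_i$, $f_k$ is the $k$-th regression tree, $T$ is the number of leaves in a tree, $w$ is the vector of leaf weights, and $\gamma$ and $\lambda$ are regularization parameters that penalize tree complexity.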
Data acquisition
Data sources and acquisition methods.
Results
Spatial distribution of visual features and public responses
Spatial distribution in color, style, and form
Figure 6(a) in the Supplemental Material shows urban color clustering on Meizhou Island. High color complexity and saturation concentrate around the tourism core, especially the Mazu Temple and North–South Street, signaling vibrant heritage zones. High harmony dominates new residential areas and renovated landscapes, reflecting coordinated color planning under regeneration. High value appears sporadically in coastal zones and public squares, where sunlight and openness prevail. Kernel density surfaces reveal sharp contrasts: tourist sites are vivid and complex, while fringes favor simpler, more cohesive palettes.
Figure 6. Spatial distribution of color, style, and urban form. (a) Distribution of color indicators: complexity, harmony, saturation, and value. Higher values represent greater intensity of each color attribute. (b) Distribution of architectural styles: five architectural styles within the study area. (c) Distribution of urban form indicators: Ground Space Index (GSI), Floor Space Index (FSI), Spacemate, Land Use Diversity (LD), Function Diversity (FD), Block Size (BS), and Street Connectivity (SC). Higher values indicate a stronger presence of each urban form feature.
Figure 6(b) in the Supplemental Material maps architectural styles. Pragmatic Chinese residences form a north–south spine, reflecting cost-driven housing demand. Neo-European forms cluster near scenic nodes in the southwest, projecting a symbolic appeal for tourists. Minnan architecture survives in scattered coastal and cultural sites, showing its marginalization in modern growth. Modernist clusters anchor gateways and civic hubs, while functionalist types, rare and institution-focused, flank schools and government buildings.
Figure 6(c) in the Supplemental Material depicts urban form. Floor and ground space indices peak on the northern coast and central districts, with Spacemate ratios marking a dense, compact spine. Southern and western areas remain low-rise, creating a north–south gradient. Functional diversity concentrates near cultural and commercial hubs, while peripheries stay monocultural. Eastern cores exhibit tight street connectivity and walkable grids; the south favors large, regular blocks, underscoring tension between inherited forms and modern planning.
Spatial variation in public perception and spatial behavior
Figure 7 illustrates the spatial variation in public perception and behavioral intensity across Meizhou Island, revealing structurally differentiated patterns. High perceptual scores cluster along the northern coast and at cultural landmarks, most prominently the Mazu Temple and seafront promenades, forming a clear “north-strong, south-weak” gradient. Accessibility and scenic appeal amplify visual engagement, generating abundant positive sentiment online, a classic “landmark amplification effect.” Southern interiors and western fringes score low, lacking strong cues or narrative visibility.
Figure 7. Spatial variation of public perception and behavior based on kernel density. (a) Kernel density of public perception. Higher values indicate areas with stronger emotional responses and higher public perception. (b) Kernel density of spatial behavior. Higher values indicate areas with more frequent visits and higher spatial activity, reflecting stronger behavioral engagement.
Behavioral activity concentrates along the southern coastline, especially Golden Beach and nearby open spaces, forming a tourism-driven core shaped by landscape amenities and spatial permeability. Moderate activity is observed in central neighborhoods and the northern spine, characterized by mixed uses but with weaker digital resonance. Inland hills remain behavioral voids.
Perception and behavior often diverge; visually striking zones may attract little physical engagement, while unremarkable areas sometimes draw high activity. These mismatches reflect spatial layout, content legibility, and sharing norms, underscoring the multi-layered dynamics of public response.
OLS regression results
OLS regression results: effects of color, style, and form on perception and behavior.
***p < 0.001, **p < 0.01, *p < 0.05.
Across all specifications, color-related components exhibit stronger and more stable associations with public responses than architectural style. Style remains positively associated with both outcomes, but its coefficients are comparatively small.
In the perception models (Models 1), Color_PC2 shows a consistent and positive effect on emotional ratings, indicating that richer hue composition and contrast enhance perceptual appraisal. By contrast, Color_PC1 displays weaker and less stable associations. This suggests that perceptual responses are more sensitive to color structure than to overall intensity.
In the behavior models (Models 2), Color_PC1 exhibits a significant negative association with behavioral intensity, while the effects of Color_PC2 are weaker and less consistent. These results point to a divergence between perceptual appeal and actual spatial use. Architectural style remains positively associated with behavior, though with modest effect sizes. Among urban form variables, functional diversity shows a robust positive effect across models.
Given the spatial nature of the data, residual spatial autocorrelation was assessed using Moran’s I and found to be significant. Spatial lag and spatial error models were therefore estimated as robustness checks. The key coefficients retain their signs and relative magnitudes, indicating that the OLS results capture substantive relationships rather than artifacts of spatial dependence. Detailed spatial regression results are reported in the Supplemental Material Table S3.
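The residual check described above relies on global Moran's I. A minimal sketch of the statistic follows, using a toy row of four cells with rook adjacency; in practice the input values would be the OLS residuals and the weights would come from the grid's neighborhood structure, both of which are assumptions here. Significance testing (for example by permutation) is not shown.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and spatial weights w_ij (with w_ii = 0)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Four cells in a row; each cell neighbours the cells directly beside it.
w = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1.0, 1.0, -1.0, -1.0], w)
alternating = morans_i([1.0, -1.0, 1.0, -1.0], w)
```

Similar values sitting next to each other give a positive statistic, whereas a strictly alternating pattern gives a negative one. A positive value for regression residuals is the situation that motivates spatial lag or spatial error specifications.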
XGBoost results and SHAP-based interpretation
The OLS results reveal a clear linear relationship between architectural color and public perception, identifying the primary effects. In contrast, the XGBoost model uncovers non-linear interactions, adding depth to our understanding by capturing more complex relationships that OLS overlooks (Figure 8). This model highlights key patterns, such as the heightened sensitivity of behavioral responses to color complexity and hue contrast, which remain hidden in simpler linear models. Together, OLS provides a foundational understanding of the linear effects, while XGBoost exposes intricate, non-linear dynamics, offering a fuller picture of how architectural color influences public perception and behavior.
Figure 8. XGBoost prediction results and SHAP interpretation for perception and behavior. (a) XGBoost model results for perception. (b) XGBoost model results for behavior.
Figure 8 presents the results from the XGBoost models and corresponding SHAP analyses, identifying the key visual and spatial variables shaping PPSB.
In the perception model, color complexity, block size, and function diversity emerged as the most influential features. Color complexity received an importance score of 81 in XGBoost and ranked second in mean SHAP value, just behind block size. This underscores its salient role in shaping emotional perception. Function diversity also showed consistently high importance, reinforcing the impact of spatial function mix on perception. Among color metrics, average saturation and color harmony also registered strong effects: saturation displayed negative SHAP values at higher levels, while color harmony exerted a stable positive influence. These patterns suggest that perceptual appeal depends not just on visual intensity, but on chromatic balance.
In the behavior model, block size and color complexity remained the top predictors, with importance scores of 175 and 116, respectively. The SHAP dependence plot for color complexity revealed a nonlinear, inverted U-shaped pattern: moderate levels of complexity were associated with higher engagement, while both overly simple and overly complex environments appeared to suppress activity. Function diversity, though ranked slightly lower, maintained high SHAP values, reinforcing its role in supporting behavioral participation. In contrast, style consistently ranked low in both models, indicating that stylistic variation has limited predictive value compared to more immediately visible or spatially salient features.
In addition, SHAP interaction effects further revealed distinct nonlinear dynamics. In low-complexity settings, larger blocks produced stronger negative SHAP values—suggesting that expansive yet visually plain spaces reduce perceptual and behavioral appeal. Conversely, the interaction between color complexity and function density remained positive and stable, indicating that functional richness consistently enhances the influence of color.
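As a hedged illustration of what an interaction attribution isolates, the sketch below computes the pairwise Shapley interaction value for a toy score in which block size only lowers the score when color complexity is low. The model form, coefficients, zero baseline, and feature values are all assumptions chosen for the example, not estimates from the study.

```python
from itertools import combinations
from math import factorial

def toy_score(x):
    """Hypothetical score: large blocks hurt only where color complexity is low."""
    return (x["color_complexity"]
            + 0.5 * x["function_density"]
            - 0.8 * x["block_size"] * (1.0 - x["color_complexity"]))

def interaction_value(model, x, baseline, i, j):
    """Pairwise Shapley interaction value for features i and j (i != j)."""
    names = list(x)
    m = len(names)
    others = [n for n in names if n not in (i, j)]

    def value(coalition):
        return model({n: (x[n] if n in coalition else baseline[n]) for n in names})

    total = 0.0
    for size in range(len(others) + 1):
        weight = factorial(size) * factorial(m - size - 2) / (2 * factorial(m - 1))
        for subset in combinations(others, size):
            s = set(subset)
            # Joint effect of i and j beyond their separate effects
            delta = value(s | {i, j}) - value(s | {i}) - value(s | {j}) + value(s)
            total += weight * delta
    return total

# Invented observation: a low-complexity, large-block setting.
x = {"color_complexity": 0.1, "block_size": 0.9, "function_density": 0.5}
baseline = {name: 0.0 for name in x}

cc_block = interaction_value(toy_score, x, baseline, "color_complexity", "block_size")
cc_function = interaction_value(toy_score, x, baseline, "color_complexity", "function_density")
print(cc_block, cc_function)
```

In this toy model only the complexity and block-size pair shares a product term, so it is the only pair with a nonzero interaction value. The value is split evenly between the (i, j) and (j, i) entries, so each entry carries half of the joint term.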
Overall, the SHAP results align closely with regression findings. Color complexity acts as a robust and nonlinear driver of both perception and behavior, with effects conditioned by morphological context. Style, by contrast, plays a secondary role, contributing more to aesthetic variation than behavioral motivation. These findings highlight the need to treat color not merely as a decorative element, but as a central component in shaping public experience and guiding spatial use—particularly within visually and culturally sensitive urban environments.
Discussion
Differentiated mechanisms of color, style, and form
The results show that architectural color exerts a stronger influence on PPSB than architectural style. Architectural color variables shape both perception and behavior, though through distinct channels. High levels of color harmony and complexity significantly enhance visual pleasure and environmental evaluations, echoing evidence that coordinated color schemes strengthen restorative effects and affinity (Gu et al., 2025). Style, though positively associated with PPSB, shows weaker and less consistent effects, particularly in behavioral models. This suggests that style relies more on cultural decoding and symbolic association than on immediate affect (Xu et al., 2023).
Color and style also diverge in how they affect perception versus behavior. Color complexity enhances perceptual engagement but may inhibit behavioral activity, likely due to cognitive overload from excessive visual stimuli (Chen et al., 2024; Tuch et al., 2009). This aligns with the SOR model, which posits that visual stimuli (the “stimulus”) trigger cognitive and emotional reactions (the “organism”), ultimately shaping responses (the “response”) (Madan et al., 2018; Mehrabian and Russell, 1974). Excessive complexity, in particular, could induce overstimulation, leading to reduced engagement. Similarly, arousal theory suggests that moderate stimulation increases engagement, whereas too much leads to fatigue (X. Chen et al., 2024). In contrast, style exerts stable and positive behavioral effects but limited perceptual impact. Taken together, the SHAP analysis and regression results show that the marginal contribution of style to behavioral responses is higher than its contribution to perceptual evaluations. This indicates that style influences behavior through cognitive mechanisms, such as cultural decoding and symbolic association, and depends more on cultural context than on immediate emotional arousal. This aligns with the argument of Xue et al. (2025) that style perception is shaped through cultural imagery, and echoes the account by Yosifof and Fisher-Gewirtzman (2024) of semantic embedding and cognitive involvement in style recognition.
In addition, the interaction analysis reveals that urban form variables interact with color features, shaping their effects on PPSB. Specifically, large Block Size weakens positive perceptions under low-complexity conditions, while Functional Diversity provides consistent support across all complexity levels, strengthening the perceptual expressiveness of color features. This structural moderation mechanism confirms the foundational role of morphology in shaping perceptual responses and supports the necessity of incorporating urban form into multidimensional perception models (Chen et al., 2022; Kasraian et al., 2021; Lu et al., 2025).
Taken together, color operates primarily through immediate and emotion-driven pathways, whereas style reflects symbolic attributes and cognitive processing. The two show clearly differentiated mechanisms in shaping PPSB, and both are conditioned by urban form. Understanding these nuanced, multi-layered mechanisms offers valuable insights into the role of visual elements in urban settings—particularly within visually and culturally sensitive urban environments. This knowledge can inform urban design practices, highlighting the importance of integrating both color and style to better align with perceptual experiences and cultural contexts.
Implications for perception and behavior of urban regeneration
Compared with architectural style, color exerts a more immediate, affective, and spatially consistent influence, offering direct leverage for regeneration-oriented design. The findings show that spatial differentiation in color use must align visual experience with spatial function. In tourism cores and dense commercial corridors, excessive color complexity, especially highly saturated or strongly contrasting tones, can trigger perceptual overload and dampen behavioral engagement. Along main roads and public nodes, a coherent “color axis” can structure façades, complemented by periodic visual anchors to enhance legibility without inducing fatigue (Sousa et al., 2023). In peripheral and transitional neighborhoods, localized palettes rooted in traditional colors or natural hues can strengthen place perception and community identity (Gu et al., 2025). Overall, a zone-based color strategy reveals how color operates simultaneously as a sensory stimulus and behavioral cue, with context-dependent effects across urban settings.
Additionally, while style plays a weaker role in perception, it exerts stable, positive effects on behavior. This suggests that style can influence behavioral decisions through mechanisms such as cognitive processing, cultural decoding, and contextual adaptation (Xue et al., 2025; Yosifof and Fisher-Gewirtzman, 2024). Accordingly, in historic districts and cultural landscapes, consistency of stylistic expression and continuity of historical context appear to support behavioral engagement. Attention to façade materials, window proportions, and decorative languages may contribute to perceptual coherence in the experience of urban character. In newly built or regenerated areas, a measured diversification of styles—integrating modern and traditional vocabularies—can enhance symbolic richness and behavioral appeal, while avoiding aesthetic fragmentation.
Finally, the interaction results highlight how urban form conditions the effects of visual variables. Functional diversity shows consistent positive synergy with color complexity, suggesting that a rich mix of functions strengthens visual engagement. In contrast, large, monotonous blocks may dampen perception under visually simple conditions, calling for façade articulation or meso-scale spatial punctuation to restore experiential depth. These insights support a more integrated perspective on visual design—linking color, style, and form into a composite strategy that enhances perception and supports behavior across varied urban contexts (Chen et al., 2022; Kasraian et al., 2021).
Conclusion
This study proposes an integrated framework to examine how architectural color and style shape PPSB. By combining street-view image analysis, urban form metrics, and user-generated content, it bridges visual attributes with subjective and behavioral responses at the city scale. The findings demonstrate that color exerts a significant and more immediate influence than style. Indicators such as harmony and complexity correlate positively with emotional perception and spatial participation, pointing to color’s direct, affective role in shaping urban experience. In contrast, style exhibits weaker but more context-dependent effects, operating through symbolic recognition and cultural interpretation—slower pathways that engage cognition more than emotion.
Urban form conditions these effects. Block size, connectivity, and functional mix interact with visual variables, enhancing or dampening their impacts. These spatial interactions highlight the importance of reading form and visual cues together, rather than in isolation. Compared with prior studies that isolate single visual features, this research underscores the value of multi-variable modeling in unpacking perceptual mechanisms and guiding design decisions.
Methodologically, the framework advances perceptual urbanism by fusing automated visual recognition with crowdsourced sentiment and behavior data. It demonstrates the interpretability and scalability of computable visual metrics and offers a replicable approach for cross-regional studies. By translating visual features into behavioral insights, the study provides empirical grounding for perception-oriented and culturally sensitive urban regeneration.
This study has several main limitations. First, the case of Meizhou Island, a tourism-oriented heritage setting, may constrain generalizability, while social media data are subject to selection and expression biases. Second, unobserved factors such as building function may also shape perception and behavior and should be incorporated in future work. Despite these constraints, the findings reveal robust links between visual form and public response, providing a foundation for causal inquiry into how urban color, style, and morphology shape perception and behavior.
Supplemental material
Supplemental Material for Color or style, which one matters more? Testing the influence of spatial characteristics on visitors’ perception and behavior via CNN-XGBoost by Di Yang, Rongwei Huang, Peifeng Yang, Qiuyi Zhang, Liyun Huang, Jinliu Chen in Environment and Planning B: Urban Analytics and City Science
Footnotes
Author contributions
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by the Key Laboratory of New Technology for Construction of Cities in Mountain Area, Ministry of Education, Chongqing University (grant number LNTCCMA-20240101); the Youth Foundation for Humanities and Social Sciences Research, Ministry of Education, China (grant numbers 24YJCZH370 and 22YJC840041); the Natural Science Foundation of Fujian Province, China (grant numbers 2025J01373 and 2025J01372); the Scientific Research Startup Foundation of Fujian University of Technology (grant number GY-Z21016); the Project of China-Portugal Joint Laboratory of Cultural Heritage Conservation Science (grant numbers SDYY2303 and SDYY2411); the Advance Research Program of National Level Projects in Suzhou City University (grant number 2024SGY009); the General Projects of Philosophy and Social Science Research at Colleges and Universities in Jiangsu Province (grant number 2025SJYB1085); the Fujian Key Laboratory of Island Monitoring and Ecological Development (Island Research Center, MNR) (grant number 2024ZD02); the Humanities and Social Sciences Planning Research Fund of the Ministry of Education (grant number 23YJA630117); and the Social Science Fund Project of Fujian Province, China (grant number FJ2023B100).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
The data analyzed during the current study are available from the corresponding author upon reasonable request.
Author biographies
References
