1 Introduction
The rapid advancement of global urbanization imposes persistent environmental pressures. By 2050, nearly 70% of the worldwide population will reside in urban areas
[1], presenting challenges for creating livable cities. Various organizations initiated efforts to measure the livability of cities as early as the 1960s; after the concept of "livability" was introduced at the United Nations Conference on Human Settlements (Habitat Ⅱ) in 1996, measuring urban livability gained extensive attention
[2]. The assessment of urban livability entails both environmental and perceptual factors. While environmental ones (e.g., crime rates, pollution, and accessibility) form the basis of evaluation, they only gain significance when they link to positive experiences like satisfaction and happiness
[3]. Therefore, understanding people's perceptions of space is crucial for making cities livable.
People's perceptions of livability typically stem from their experiences in the physical environment, involving the processing of sensory stimuli, and the integration of memories and experiences into their understanding of spaces
[4]. Perception is a conscious interpretation and elaboration of sensory data, which can shape individual preferences and adaptation to specific settings and environments
[5]. Previous studies have often quantified human spatial perceptions via questionnaires or interviews
[6]~[8], combined with evaluation systems to determine whether a city is "livable" or "unlivable." While these methods have an empirical basis, they are inevitably influenced by individuals' perceptions and tend to be time-consuming and complex
[9]. Since the mid-2010s, advances in urban ergonomics have enabled the establishment of description models of people's spatial perception
[10]. Spatial perception technologies (SPTs) such as eye tracking, electromyogram (EMG), electrocardiogram (ECG), electroencephalogram (EEG), and electrodermal activity (EDA) provide scholars with objective physiological data that reflect perceptions in different places, thus alleviating the shortcomings of traditional methods
[11]. These technologies can aid in a better comprehension of the factors influencing urban livability.
Existing studies have employed SPTs to explore issues related to urban livability, yet they focus primarily on empirical research
[12] [13]. There remains a lack of systematic reviews and discussions on the development of SPTs, their effectiveness in enhancing livability, and their correlation with livability. This research innovatively links SPTs with multidimensional urban livability indicators. Through a bibliometric review, the authors discuss how SPTs specifically enhance environmental quality, social equity, and psychological well-being in cities, providing actionable pathways for integrating technological innovations into livable urban planning practices.
In this research, spatial perception is defined as the process through which people receive information from the environment via their senses, process these external stimuli in their brains, and ultimately form knowledge with which to comprehend the environment. Based on this definition, the authors investigated how SPTs can be used to evaluate and enhance urban livability and addressed the following questions: First, how do individuals perceive and interpret spaces, and what are the theoretical foundations underlying this process? Second, what SPTs have been used in exploring urban livability? And third, how can SPTs be effectively applied to research on livable cities?
2 Data Collection for Bibliometric Review
Web of Science (WoS) core collection and Scopus were chosen for data collection for this review. The literature type in both databases was limited to journal articles and literature reviews, and the language section was limited to English. After careful consideration of the terminology, this research was ultimately conducted using the retrieval string: TS / TITLE-ABS-KEY = ((spatial perception* OR space perception* OR perception of space* OR spatial awareness* OR visual perception* OR vision perception* OR visual sensation* OR smellscape*) AND (urban* OR city* OR liveab* OR livab*) AND (ergonomics OR human factors OR eye tracking OR eye movement OR physiological sensor OR electromyogram OR electrocardiogram OR electrooculogram OR electroencephalogram OR photoplethysmography OR electrodermal activity OR EEG)). A total of 777 articles published between January 1, 1980 and December 31, 2024 were retrieved and then systematically scrutinized in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines
[14]. After merging the duplicate results, 654 articles were obtained. The authors excluded irrelevant articles by reading the title, abstract and full text, applying the following criteria: 1) Contents: each article must focus on urban environments, wherein empirical studies must delineate specific experimental designs and operational procedures, and review articles ought to concentrate on the systematic discussion of technological applications, environmental elements, or measurement metrics. 2) Methods: participants in empirical studies must be exposed to at least one urban environment, either in real or virtual form, such as through actual walks, photographs, videos, or virtual reality (VR) simulators. 3) Indicators: each empirical study must measure at least one indicator of human physiology, cognitive ability, or emotion. After the screening process, 402 articles were ultimately selected for the bibliometric review.
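The duplicate-merging step between the two database exports could be sketched as follows. This is a minimal illustration rather than the procedure actually used: the field names (`doi`, `title`), the matching rule (DOI first, normalized title as a fallback), and the placeholder records are all assumptions.

```python
import re

def record_key(record):
    """Build a matching key: DOI if present, otherwise a normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return "doi:" + doi
    # Lowercase the title and collapse punctuation/whitespace so that
    # trivially different spellings of the same title compare equal.
    title = re.sub(r"[^a-z0-9]+", " ", record.get("title", "").lower()).strip()
    return "title:" + title

def merge_duplicates(*sources):
    """Keep the first record seen for each key across all sources."""
    merged = {}
    for records in sources:
        for record in records:
            merged.setdefault(record_key(record), record)
    return list(merged.values())

# Placeholder records (hypothetical, not taken from the reviewed corpus).
wos = [
    {"doi": "10.1000/example.1", "title": "Placeholder Study A"},
    {"doi": "", "title": "Placeholder Study B"},
]
scopus = [
    {"doi": "10.1000/EXAMPLE.1", "title": "Placeholder study A"},
    {"doi": None, "title": "Placeholder study B."},
    {"doi": "10.1000/example.3", "title": "Placeholder Study C"},
]
merged = merge_duplicates(wos, scopus)
```

A rule this simple would miss a pair in which only one export carries the DOI, so in practice the merged list would still need manual checking.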
Bibliometric review is a valuable method for evaluating scientific production, as it describes research hotspots, tracks evolutionary trends, and predicts future research directions scientifically, objectively, and quantitatively
[15]. Advances in data and information visualization technology have led to the emergence of many bibliometric review tools, including BibExcel, CitNetExplorer, VOSviewer, and CiteSpace. Compared with other software, CiteSpace's uniqueness lies in its ability to detect and visualize burst terms and betweenness centrality, enabling the identification of emerging trends, fundamental changes, and major turning points within a field
[16]. The 402 articles were imported into CiteSpace 6.4.R1 (64-bit), with the time threshold set from 1998 to 2024
① and the time slice set to one year. The authors applied the Log-Likelihood Ratio algorithm for clustering analysis and network generation. First, the distribution of retrieved articles from different years, journals, institutions, and countries was analyzed to obtain a general overview of the research field; second, the co-citation clustering network was generated to illustrate the intellectual base; finally, the timeline view of the keywords cluster network and the keywords burst detection were obtained to summarize the research hotspots.
① Among the 402 articles analyzed in CiteSpace, the earliest was published in 1998; therefore, the time threshold for the analysis was set from 1998 to 2024.
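Betweenness centrality, one of the measures that motivated the choice of CiteSpace, counts how often a node lies on the shortest paths between other nodes. The sketch below uses Brandes' algorithm for an unweighted, undirected graph; it is an illustration of the measure under these simplifying assumptions, not CiteSpace's own implementation, and the toy graph is hypothetical.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality for an undirected, unweighted graph.

    adj maps each node to an iterable of its neighbors.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy network: a chain a - b - c, where b bridges a and c.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = betweenness(toy)
```

In this toy chain only the middle node receives a non-zero score, which is the kind of bridging role that flags a potential turning point in a co-citation network.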
3 Results
3.1 A General Overview of the Research Field
The retrieved articles can be encapsulated within three phases (Fig.1), which elucidate the evolutionary trajectory of academic research from a niche exploration to a mainstream focus.
The first phase (1998–2013) represents the period of theoretical frameworks formation. The technological exploration during this period predominantly focused on eye tracking
[17] [18], with experimental designs primarily utilizing static, image-based laboratory scenarios, lacking systematic analysis of the correlations between physiological data and perception. Due to the limited standardization of technical tools and the prohibitive equipment costs, empirical research relied heavily on questionnaire surveys and in-depth interviews
[19] [20]. Although the research during this phase was limited in quantity, the methodological innovations and the conceptual framework of livable cities laid crucial groundwork for subsequent research.
The second phase (2014–2020) represents a period of research paradigm optimization, during which the annual publication volume remained between 10 and 30 articles. Researchers validated the mechanisms through which environmental factors influence spatial perception by expanding sample sizes, refining experimental control conditions, and integrating multisource data. This period witnessed rapid development in SPTs and equipment. Furthermore, interdisciplinary collaboration among Urban Planning, Neuroscience, and Computer Science became increasingly institutionalized. The research settings expanded beyond the laboratory to real-world environments, including streets, parks, and communities, reflecting the deepening of human-centered urban environment design principles
[21] [22].
The third phase (2021–2024) represents a period of societal demand transformation, and the annual publication volume reached a peak of 77 articles in 2024. The research focus has shifted from a "phenomenological description" to "intervention design," emphasizing practical applications. Concurrently, diversified technological applications, encompassing VR and augmented reality (AR) simulations of complex urban environments, alongside artificial intelligence (AI)-driven big data analytics, have emerged as new research trends
[23]~[25]. Notably, relevant research has become deeply integrated with pressing societal issues, not only driving innovation in urban planning concepts but also gaining widespread societal recognition and cross-disciplinary support.
3.2 The Co-citation Clustering Network
The co-citation analysis is instrumental in identifying the intellectual base of a research field and tracking the evolution of its research fronts
[26]. In a co-citation cluster, the node size displays the number of citations of a given article, and the connecting lines represent the relationships between articles
[27]. The co-citation network of this research (Fig.2) has a modularity Q value of 0.9018 and a silhouette value of 0.9555, indicating that the clustering effect is significant. The resulting clusters reveal three distinct domains of the current foundational research: visual perception and urban environments (i.e., street view images, eye tracking, visual comfort); multisensory environments and health (i.e., soundscape, human health, outdoor thermal comfort); and neuroscience and cognitive mechanisms (i.e., neuroscience, perception).
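For readers unfamiliar with the modularity Q reported above, the sketch below computes Newman's modularity for an unweighted, undirected network as the sum over clusters of (internal edges / m) − (cluster degree sum / 2m)². It assumes this standard definition; CiteSpace's exact computation (e.g., any edge weighting) is not specified here, and the toy network is hypothetical.

```python
def modularity(edges, community):
    """Newman's modularity Q for an undirected, unweighted edge list.

    edges: list of (u, v) pairs; community: dict mapping node -> cluster label.
    """
    m = len(edges)
    internal = {}    # number of edges with both endpoints in the cluster
    degree_sum = {}  # total degree of the nodes in the cluster
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
clusters = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
q = modularity(edges, clusters)  # 5/14, roughly 0.357
```

Values of Q closer to 1 indicate that edges fall within clusters far more often than expected by chance, which is why a value near 0.9 is read as a well-separated cluster structure.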
With accelerating urbanization and increasingly complex environmental challenges, academic inquiry has shifted from single-factor environmental analyses to multidimensional, interdisciplinary systematic research. The intellectual base of spatial perception is rooted in the co-evolution of theoretical construction, technological innovation, and methodological advancement
[9] [22] [28]~[32]. The retrieved articles embed human–environment interaction theories within the core framework, integrating multidisciplinary perspectives from Psychology, Environmental Science, Neuroscience, and Computer Science. These articles aim to elucidate the complex mechanisms through which urban environments affect human perception, health, and behavior via multisensory channels, thereby providing scientific underpinnings for sustainable urban design.
From a theoretical perspective, Cognitive Psychology and environmental behavior theories constitute the cornerstone of spatial perception research. The Stress Reduction Theory and Attention Restoration Theory emphasize the positive effects of natural landscapes on psychological stress relief and physiological recovery
[28] [33]. The Soundscape Theory transcends the limitations of traditional noise control by focusing on the emotional and cognitive values of acoustic environments
[34]. The thermal comfort theory, from a biometeorological perspective, examines the impact of outdoor microclimates on human activity adaptation
[35]. These theories collectively establish a multiscale, multisensory framework for interpreting spatial perception, facilitating a paradigm shift from "environmental determinism" to "human–environment interaction theory."
Technological innovation serves as a pivotal driving force in advancing this field. For example, the integration of street view images with deep learning has enabled high-precision quantification of urban physical environments
[30]. The proliferation of eye tracking has provided micro-behavioral evidence for visual perception research. Furthermore, multimodal data fusion has become mainstream: Through integrating audio–visual experiments with physiological indicator measurements, the synergistic impact of audio–visual interactions on urban environmental satisfaction was demonstrated
[36]. These technological breakthroughs have not only enhanced the analytical precision of environmental perception but also bridged cross-scale research, spanning from individual behaviors to urban systems.
At the methodological level, spatial perception research is undergoing a transformation from static description to dynamic analysis, and from subjective evaluation to objective quantification. The introduction of a human–machine adversarial scoring framework has addressed the issue of subjective bias inherent in traditional questionnaire surveys
[9]. By leveraging the complementary strengths of machine learning and human expertise, efficient collection and validation of perceptual data were enabled. The application of dynamic analytical methods also warrants particular attention. For instance, the incorporation of soundscape prediction models and scene semantic parsing technology has translated qualitative descriptions into computable parametric systems
[31]. By delineating the spatiotemporal patterns of the visual quality of historical cities using time-series streetscape images and visual entropy calculations, dynamic decision-making support for the preservation of historic neighborhoods can be obtained
[32]. These methodological innovations not only expand the research landscape but also enhance the practicability of scientific findings in urban planning.
3.3 Timeline View of Keywords Cluster Network: Research Hotspots
As illustrated in Fig.3, the current research themes (clusters) can be categorized into three groups: human perception, SPTs and theoretical foundation, and environmental characterization research.
Through the analysis of high-frequency keywords (Tab.1) and burst keywords (Tab.2), the research hotspots can be categorized into three core dimensions: 1) technology-driven methodological innovation, 2) co-evolution of societal demands and policy frameworks, and 3) paradigm shift towards human-centered approaches and complex system analysis. The three interrelated dimensions create a research continuum that spans from theoretical exploration to practical implementation, driving the field development.
1) Technology-driven methodological innovation. The proliferation of portable sensing devices and VR has enabled dynamic perception data collection in real-world environments
[37]. Deep learning models have revolutionized the quantification of spatial attractiveness through automated image processing
[9] [31]. Multimodal data fusion has transcended the limitations of single-sensory studies, and facilitated comprehensive system analysis across the "human–technology–environment" continuum
[38]. The intersection of Neuroscience and Urban Planning has given rise to a theoretical framework that employs functional magnetic resonance imaging (fMRI) and EEG to decode neural responses to urban landscapes
[39], thus interpreting traditional design theories in terms of verifiable physiological models.
2) Co-evolution of societal demands and policy frameworks. The interplay between global challenges and the Sustainable Development Goals proposed by the United Nations (UN) is fundamentally reshaping research priorities. The COVID-19 pandemic spurred interest in therapeutic landscapes
[40], while normalization of remote working has heightened residents' expectations regarding the quality of living environments and public spaces. In response to more frequent extreme weather events, thermal comfort research has shifted from building-scale investigations to urban system-level analyses
[13], making climate adaptability of green infrastructure a focal point of empirical research
[41]. International programs have incorporated built environment studies into the UN's global climate action agendas through quantifiable metrics
[42]. Smart city initiatives have demonstrated efficient translation of research outcomes into urban governance tools
[43]. The "problem identification–technological innovation–policy response" loop ensures dynamic alignment between academic exploration and governance needs, exhibiting robust integrative capabilities in addressing complex challenges.
3) Paradigm shifts towards human-centered approaches and complex systems analysis. The current research paradigm is strategically transforming from single-element analysis to the modelling of synergistic "human–nature–technology" systems. This shift manifests in two principal dimensions: first, population aging and rights advocacy for vulnerable groups have propelled the advancement of inclusive design theory
[44]; second, complex system analysis has evolved along scale-coupling mechanisms, spatiotemporal dynamics leverage, and ecological environment assessment.
4 Discussions
The interactions among humans, technology, and the environment are mutually reinforcing, forming a triadic conceptual model for research on livable cities from the perspective of spatial perception (Fig.4). To further discuss the proposed questions and strengthen the theoretical foundation for livable city development, this section elaborates on the principles of spatial perception, SPTs, practical applications, and future research.
4.1 Principles of Spatial Perception
Designing livable cities emphasizes both improving the physical environment and understanding the dynamic interaction between human perceptions and environmental factors. Although the meaning of livability may vary among individuals, its essence lies in creating comfortable, safe, and pleasant environments that elicit positive emotional responses and behavioral reactions, thereby fostering a sense of spatial belonging and social identity. Since the 1960s, scholars have attempted to develop measurable indicators to thoroughly assess the livability of a specific city
[45]. However, the indicators may not fully reflect subjective experiences.
Cognitive Psychology provides theoretical underpinnings for analyzing the spatial perception process, which can be conceptualized as a tripartite process comprising sensation, cognition, and decision-making according to cognitive theory
[46][47] (Fig.5). In the sensation stage, the organs receive external stimuli and perform preliminary processing of the input spatial information
[48]. In the cognition stage, the brain encodes and stores information from the senses with the help of existing knowledge and experience, and it compares and identifies the information with the existing cognitive schema to form a comprehensive reflection of external stimuli
[49]. Human cognition is not a computational and coding process that separates the mind and the body, and embodied cognition theory emphasizes the importance of the body and environment in the cognitive process
[50]. Therefore, the sensation and cognition stages are usually intertwined in environmental cognitive processes, contributing to human perceptions of urban spaces
[51].
Underpinned by cognitive functional theory and Maslow's hierarchy of needs theory, the significance of spatial perception resides in its capacity to satisfy fundamental human requirements
[52]. In the decision-making stage, the cerebral evaluation of perceived spatial attributes against these needs triggers valence-specific emotional responses that subsequently drive behavioral adaptations
[53]. Within this perception–emotion–behavior continuum, emotions serve as both direct outcomes of spatial perception and crucial mediators of spatial behavior regulation. Post-action outcomes subsequently feed back into renewed perceptual cycles—exemplified by how affective amelioration through smiling may engender more positive environmental appraisals
[54]. This cyclical mechanism elucidates the dynamic interplay between spatial cognition, affective states, and behavioral patterns in urban contexts.
Quantifying human perceptual responses constitutes the prerequisite of urban livability research. Contemporary studies have systematically clarified the intricate relationships between spatial perception and livability with multidimensional intervening variables
[40], including emotion, stress, preference, satisfaction, comfort, perceived safety, perceived happiness, and nature-relatedness.
4.2 SPTs Used in Urban Studies
SPTs play a crucial role in data collection and processing during experimental design and implementation, thereby advancing urban environmental research. The spatial perception process involves intricate neurophysiological and biochemical reactions, which ultimately manifest as multistage, multilevel psychological, physiological, and behavioral changes. These changes constitute essential biomarkers for assessing spatial perception. Experimental psychology methods are widely applied to investigate the causal relationships between urban environmental characterization and intervening variables through controlled experiments. In urban studies, the variables can be categorized into environmental and human perceptual parameters (Fig.6).
The environmental parameters include both objective and subjective elements. The objective elements can be obtained through multisource data collection, while the subjective ones can be measured with questionnaires.
The human perceptual parameters, which serve as biomarkers of the spatial perception process, can be categorized into three dimensions: psychological (i.e., cognitive evaluation), physiological (i.e., neuroendocrine responses), and behavioral (i.e., spatial behavior patterns) indicators.
1) Psychological indicators. Questionnaires, interviews, and cognitive maps are commonly used to measure psychological indicators. Questionnaires can be qualitative, focusing on self-reporting and open-ended responses; or quantitative, emphasizing standardized psychological measurement tests
[37]. Interviews are oral retrospective statements conducted in in-depth, semi-structured
[55], or structured formats
[56]. However, psychological indicators measured by questionnaires and interviews are inevitably limited due to memory bias. As people learn about the environment, they create a field map in their minds that reflects their cognitive and behavioral abilities in space
[57]. Therefore, cognitive maps are suitable for recording the image of the city. Kevin Lynch proposed this set of methods as early as the 1960s
[58]. Due to the need for high-quality responses and the high cost of the data analysis process, cognitive maps are commonly combined with questionnaires or digital technologies
[59]. Through in-depth analysis of subjective experiences at both individual and group levels, psychological indicators can unveil the complex perception mechanisms. The complementary use of qualitative data and quantitative data addresses the issue of low data robustness due to relying solely on physiological signals
[60].
2) Physiological indicators. The unclear connection between emotional experiences and spatial data, as well as the difficulty in quantifying perceptual experiences, makes it challenging to accurately assess individuals' true perceptual states or identify key spatial elements. To address these limitations, real-time and high-precision physiological indicators have been introduced, along with the development of cognitive neuroscience. These indicators gauge the level of emotional arousal by detecting physiological signals from the participant. EEG records electrophysiological signals generated by the electrical activity of neuronal clusters in the cerebral cortex, reflecting changes in arousal levels during spatial perception, assessing cognitive load intensity, and revealing the emotional valence of human–space interface interactions
[23]. EDA can accurately reflect the immediate emotional arousal and stress response duration induced by environmental stressors by measuring changes in skin conductivity
[61]. ECG can reveal the dynamic balance of the autonomic nervous system through heart rate variability (HRV) spectral analysis
[62]. EMG can detect changes in skeletal muscle electrical signal intensity, focusing on analyzing increased unconscious muscle tension caused by ergonomic design flaws in urban environments and defensive motor responses triggered by the perception of dangerous spaces
[37].
3) Behavioral indicators. Eye tracking data can reveal the distribution of visual attention on spatial interfaces, which is commonly employed in studies of landscape aesthetic preferences and the effects of visual stress recovery
[63]. Positioning technologies, such as Wi-Fi probes, GNSS, and UWB, can capture real-time coordinates, movement directions, dwell times, and motion trajectories of individuals. When combined with trajectory clustering algorithms, the technologies help extract representative pedestrian flow routes and measure the built environment of streets
[64]. Photography and video, integrated with deep learning technology, enable the detailed observation of the relationship between behavioral decisions and spatial characteristics, breaking the temporal cross-sectional limitations of behavioral research. This approach focuses on the impact of temporal changes in environmental factors on behavior within fixed contexts, and supports longitudinal tracking studies across temporal dimensions
[65]. Facial expression recognition captures micro-movements of facial muscles, changes in facial features, and the duration of expressions, and generates emotional heatmaps and time-series change curves. This allows analysis of emotional fluctuation patterns during spatial experiences
[66].
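As an illustration of how HRV, mentioned above under physiological indicators, can be summarized from an ECG recording, the sketch below computes two common time-domain measures, SDNN and RMSSD, from a series of RR intervals. Note that this swaps in time-domain measures for simplicity; the spectral analysis referred to in the text would additionally require resampling and a frequency transform. The RR values are hypothetical.

```python
from math import sqrt
from statistics import stdev

def sdnn(rr_ms):
    """Standard deviation of RR intervals (ms), using the sample formula."""
    return stdev(rr_ms)

def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical RR intervals in milliseconds (not measured data).
rr = [800, 810, 790, 805, 795]
overall_variability = sdnn(rr)
beat_to_beat_variability = rmssd(rr)
```

Comparing such values between exposure conditions (for example, a park walk versus a traffic-heavy street) is one simple way the physiological data could be related to the environment.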
Technological innovation in spatial perception research is primarily reflected in the deep integration of multimodal data and the collaborative application of intelligent algorithms. Core methodologies include spatiotemporal data overlay analysis and machine learning-driven predictive modelling. For instance, through spatial interpolation analysis on GIS platforms, physiological indicators can be matched with urban morphological parameters, thereby identifying the spatial distribution patterns of "stress hotspots" and their morphological causes
[67]. The overlay of street view images and air quality monitoring data enables real-time monitoring and prediction of the spatial distribution of urban air pollution, providing data support for environmental governance
[68]. The dynamic coupling of street view images and GPS trajectory data offers a scientific basis for planning personalized routes and enhancing the quality and safety of outdoor activities.
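The spatial interpolation step described above can be illustrated with inverse distance weighting (IDW), one common interpolation method; whether the cited studies used IDW or another method (e.g., kriging) is not stated, so this choice is an assumption. The sample points and stress values are hypothetical.

```python
def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples: list of (sx, sy, value) tuples, e.g. georeferenced
    physiological readings; power controls how fast influence decays.
    """
    numerator = denominator = 0.0
    for sx, sy, value in samples:
        d2 = (sx - x) ** 2 + (sy - y) ** 2
        if d2 == 0:
            return value  # query point coincides with a sample
        weight = 1.0 / d2 ** (power / 2)
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Hypothetical stress readings at two locations along a street.
readings = [(0.0, 0.0, 1.0), (2.0, 0.0, 3.0)]
midpoint_estimate = idw(readings, 1.0, 0.0)
```

Evaluating the function over a regular grid yields a continuous surface on which locally elevated values can be inspected alongside urban morphological parameters.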
The introduction of machine learning algorithms has significantly enhanced the predictive accuracy and explanatory power of spatial perception research, providing innovative methodologies for analyzing complex spatial systems. Recurrent neural networks have been employed to simulate urban dynamic soundscapes over extended periods, incorporating optimal temporal dimensions for urban soundscape cognition sustainability, thereby predicting their impacts on human perception
[69]. This advancement signifies a paradigm shift from single-factor analysis to complex systems research. Existing studies have utilized machine learning models (e.g., support vector machines
[29], self-organizing maps
[10]) to construct mapping relationships between spatial variables and multidimensional indicators, achieving precise prediction of the integrated effects of multiple spatial elements. A notable study leveraging multisource big data from New York City, USA, quantified and modelled spatial elements and place characteristics across more than 100 representative sites, employing classification and regression analysis to identify dominant drivers of pedestrian activities in different spatial typologies and systematically elucidate the influence of urban environmental features and social dynamics on human activities
[70].
In recent years, spatial perception research has evolved from singular environmental descriptions to multidimensional cognitive analyses. Three methodological approaches (correlation analysis, mechanism exploration, and predictive simulation) have been applied to studies of livable urban environments. Correlation analysis, as a foundational paradigm, employs statistical models to quantitatively reveal relationships between environmental factors and perceptions. For instance, multinomial logistic regression models have been applied to identify differences in urban residents' visual attention toward ethnic landscapes
[71]; structural equation modelling has been utilized to decipher the complex mechanisms through which urban environmental elements influence soundscape perception
[72].
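The simplest instance of the correlation analysis paradigm is a Pearson correlation between one environmental factor and one perceptual score. The sketch below shows that baseline only; it is not the multinomial logistic regression or structural equation modelling used in the cited studies, and the variable names and values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical data: share of visible greenery per site and rated comfort.
green_view = [0.10, 0.20, 0.30, 0.40]
comfort = [2.0, 4.0, 6.0, 8.0]
r = pearson_r(green_view, comfort)
```

A coefficient near +1 or −1 indicates a strong linear association, but it says nothing about the mechanism, which is what the regression and structural models in the cited work are meant to address.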
Mechanism exploration integrates multimodal data and interdisciplinary theories to uncover the underlying mechanisms of individual spatial perception. Recent research has innovatively proposed a VR-based eye tracking data collection method using panoramic videos, which successfully establishes a large-scale dataset of eye movements in dynamic scenes and demonstrates that local entropy can effectively measure visual information density in built environments
[73]. Such findings provide architects with an objective metric to assess the visual attractiveness of various spatial interfaces, and guide visual attention in public spaces through detailed design
[73].
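The local entropy metric mentioned above can be illustrated as Shannon entropy computed over a sliding window of pixel values. This is a generic sketch: the window size, the handling of image borders, and the use of raw gray levels are assumptions, and the cited study's exact formulation may differ.

```python
from collections import Counter
from math import log2

def shannon_entropy(values):
    """Shannon entropy (bits) of the empirical distribution of values."""
    counts = Counter(values)
    n = len(values)
    return -sum(c / n * log2(c / n) for c in counts.values())

def local_entropy(image, window=3):
    """Entropy of the window x window neighborhood around each pixel.

    image: 2D list of gray levels; neighborhoods are clipped at the borders.
    """
    height, width = len(image), len(image[0])
    r = window // 2
    result = []
    for i in range(height):
        row = []
        for j in range(width):
            patch = [
                image[y][x]
                for y in range(max(0, i - r), min(height, i + r + 1))
                for x in range(max(0, j - r), min(width, j + r + 1))
            ]
            row.append(shannon_entropy(patch))
        result.append(row)
    return result

# Hypothetical 2x2 patches: a blank wall versus a two-tone facade.
flat = local_entropy([[5, 5], [5, 5]])
striped = local_entropy([[0, 1], [0, 1]])
```

Uniform regions score zero while regions mixing several gray levels score higher, which is the sense in which the measure tracks visual information density.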
Predictive simulation employs statistical models, machine learning, and deep learning technologies to analyze historical or sampled data to forecast future scenarios. The breakthrough in deep learning models for predicting visual attention distribution has equipped urban planners to identify visual focal points in urban landscapes, supporting spatial optimization and visual appeal enhancement
[74]. The convolutional-recurrent neural network model, trained on a specialized dataset, can accurately predict the spatiotemporal distribution of visual attention in urban environments, providing reliable references in practical engineering projects
[73].
4.3 Practical Applications
Creating livable urban environments requires a systematic endeavor that integrates physical spaces, urban society, and human perceptual needs, recognizing their complex interconnections and mutual influences across scales, from macro urban structures to micro architectural units. To explore the potential of SPTs in practical applications, this research classified the urban environment into built environments, public spaces, roads and transportation facilities, underground spaces, and infrastructure, based on their functional characteristics and spatial layouts (Fig.7).
Research on the built environments focuses on spatial layouts, facade design, and the interaction between humans and buildings at multiple scales. At the urban scale, moderate to high overall illumination levels in urban parks can enhance accessibility while minimizing cognitive load
[75]. At the street scale, textual signage, street interfaces, and billboards have been examined to assess the impacts on residents' visual appeal and comfort levels
[63]. At the architectural scale, detailed design should maintain environmental coherence, and the window dimensions should be enlarged to improve landscape visibility, enhance stress recovery, and promote positive effects on mental health
[24]. Although such studies are limited in quantity, their diverse topics and unique perspectives provide multidimensional theoretical support and practical insights into how the built environment can enhance urban livability.
Public space, serving as a core type of recreational and leisure site within urban environments, has become a research hotspot due to its rich inclusion of natural elements. Commonly discussed intervening variables are stress, restorativeness, and visual preferences. Related studies have explored the mechanisms through which landscape elements (e.g., soundscapes
[62], biodiversity
[76], vegetation types and density
[77], seasonal variations
[78]) influence individual perceptions. Notably, high-quality public space development relies not only on optimizing natural elements but also on integrating principles of fairness and justice. From the perspective of distributive justice, strengthening heat warning systems and emergency response measures can effectively protect thermally vulnerable groups such as older adults, children, and outdoor workers
[79]. From a gender-sensitive perspective, adjusting vegetation density, optimizing visual focal points, and improving pathway design can significantly enhance the pedestrian experience for female users
[80].
Research on road and transportation facilities primarily concentrates on travel modes, driving behavior, and supply chain transportation efficiency. In research on travel modes, key factors including infrastructure conditions, traffic environment, personal safety, comfort, and destination accessibility significantly influence cyclists' route selection
[55]. In research on driving behavior, selecting music with a moderate tempo and unfamiliar lyrics can minimize interference with a driver's operational performance
[81]. On supply chain transportation efficiency, eye tracking and interaction data from real-world tests demonstrate that handheld devices excel in precise navigation, whereas in-view displays improve rapid hazard detection
[82].
Underground space has become an inevitable choice for expanding urban capacity. Systematic research has examined the acoustic conditions, lighting, air quality, and landscape elements of underground spaces using multidimensional physiological indicators (e.g., melatonin concentration, pupillary dynamics, EEG), yielding scientific evidence for the optimization of subterranean environmental systems. Relevant studies have revealed that both indoor greenery and artificial windows can positively impact underground environments while effectively enhancing users' motivation
[61]; elevated humidity and CO₂ concentrations may raise skin temperature and trigger physiological symptoms such as headaches and fatigue
[83]; nature-mimicking artificial lighting plays a beneficial role in maintaining human circadian rhythms
[84].
Infrastructure, though less directly encountered in residents' daily lives, is gradually gaining attention for its impact on spatial perception. In audio-visual perception studies of new wind energy facilities, wind parks were found to significantly increase residents' subjective visual and aural annoyance; however, this annoyance primarily stems from residents' attitudes toward the wind parks rather than the sound itself
[85]. Comparative research on physiological and behavioral responses to renewable energy systems across different landscape types reveals a significantly higher preference for natural landscapes over urban settings
[86]. These findings provide valuable insights for planners and policymakers, highlighting the potential value of natural landscapes in renewable energy system and infrastructure planning.
In advancing the development of livable cities, the innovative application of SPTs has provided crucial support for enhancing urban environmental quality
[87]. Experiments utilizing mobile eye-tracking glasses verified the differential impacts of variations in the physical configurations of the street interface on pedestrian visual perception, providing an empirical basis for the assessment-oriented spatial framework of arterial street livability
[63]. Optimizing architectural layouts based on residents' visual preferences can maximize scenic views and align spatial form with behavioral needs
[88]. Healthy community development has benefited from targeted interventions such as noise barriers and green belts, informed by correlation analyses between environmental factors (e.g., air pollution, road network density) and mental health indicators
[89]. As for urban safety and resilience, real-time positioning of trapped individuals via mobile signals and wearable devices, combined with rapid assessment of injury severity through physiological indicators, has significantly enhanced emergency response efficiency
[90].
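The correlation analyses between environmental factors and mental health indicators mentioned above can be illustrated with a minimal sketch of the Pearson correlation coefficient. This is a generic statistical illustration, not the procedure of the cited study; the function name `pearson_r`, the variable names, and all data values are hypothetical placeholders invented for this example.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and the two deviation norms.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (norm_x * norm_y)


# Invented placeholder values, one pair per neighborhood:
# road network density (km per km^2) and a stress score (arbitrary units).
road_density = [2.0, 4.0, 6.0, 8.0]
stress_score = [11.0, 15.0, 19.0, 23.0]
r = pearson_r(road_density, stress_score)
```

A coefficient near 1 or -1 indicates a strong linear association but does not by itself establish causation, so interventions such as noise barriers or green belts would still need to be evaluated empirically.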
In terms of design practice, during the design of Big Air Shougang, the competition venue for freestyle skiing and snowboarding at the Beijing 2022 Winter Olympics, architects simulated the spatial distribution of lakeside walking sightlines and carried out several design iterations by employing a visual attention distribution prediction model. In the post-construction period, eye tracking data confirmed that the redesigned jump significantly improved the lakeside visual experience, validating the effectiveness of the technical tools used
[91]. During the design of the National Ski Jumping Centre's Summit Club in Zhangjiakou, China, a virtual viewing behavior test revealed a visual preference for the eastern side of the corridor, particularly towards the valley. Consequently, the architects designed a uniquely eccentric circular indoor space to maximize scenic views
[88].
4.4 Prospects
Given the recent progress of this field, three critical research directions warrant attention.
1) For technological innovation, multimodal data integration can be enhanced with advanced synchronization algorithms and low-cost, non-invasive tools. Generative AI and digital twins could further democratize access to spatial perception analysis. The iteration of metaverse and AR technologies can enhance the authenticity and controllability of virtual experimental environments, providing technical support for dynamic simulation and optimized design of complex urban systems. This will ultimately facilitate a paradigm shift from data-driven descriptive analysis to causal intervention.
2) For research subjects, existing studies focus on general groups (e.g., youth and healthy individuals) and often overlook cultural diversity. Future work should expand research inclusivity by covering diverse populations (e.g., older adults, children) and conducting cross-cultural comparisons (e.g., between high-density Asian cities and low-density European cities) to develop universally adaptable design principles. For instance, therapeutic public spaces in China could either incorporate vertical greening elements tailored to East Asian urban preferences, or explore flexible universal design strategies to reconcile conflicting needs among diverse user groups.
3) For practical application, the convergence of virtual and physical environments and the integration of public participation are expected to become significant trends in future research. Participatory planning through virtual-physical interfaces can enable real-time public feedback and support iterative design processes. Predictive simulation remains underdeveloped in current research. Future studies should expand sample sizes for empirical investigations while employing AI algorithms to accurately predict individuals' physiological and psychological states across varying environments. The key to achieving these technological breakthroughs lies in fostering deep interdisciplinary collaboration, particularly among neuroscience, cognitive psychology, and intelligent technologies. This integration will facilitate the development of a perception-driven design paradigm for livable cities, transforming urban settings from mere physical constructs into emotionally resonant and socially restorative environments.
5 Conclusions
Against the backdrop of accelerating urbanization and growing societal demands for livable cities, this research synthesizes the evolving role of SPTs in advancing urban livability through a comprehensive bibliometric review. By analyzing 402 articles from the WoS and Scopus databases, the authors identified three distinct developmental phases and highlighted the following results: 1) Theoretically, cognitive psychology and environmental behavior theories form the interdisciplinary foundation for exploring the perception–emotion–behavior continuum, where emotions act as critical mediators between environmental elements and behavioral outcomes. 2) Technologically, the proliferation of SPTs, such as eye tracking, EEG, EMG, and VR, has enabled quantitative analysis of subjective experiences, bridging gaps between traditional qualitative assessments and livable city design. 3) Practically, the results of applying these technologies to establish a human-centered design framework, derived through empirical studies, are valuable in both simulated and real-world scenarios (i.e., built environments, public spaces, transportation, underground spaces, and infrastructure). This research concludes with a triadic model that encapsulates the dynamic interplay among human, technology, and environment, which transcends the traditional unidirectional paradigms of "humans adapting to the environment" or "technology-driven design." It emphasizes bidirectional feedback loops that inform livable urban planning and design, and provides a replicable and extensible theoretical paradigm for livable city research.
This research has several limitations. In terms of data scope and bias, the reliance on WoS and Scopus databases may exclude valuable studies from non-English journals, potentially overlooking culturally specific insights. Future research should incorporate multilingual sources to enhance representativeness. As for temporal constraints, this research captures trends up to 2024, but rapid advancements in AI and sensor technologies may render some findings obsolete. Continuous updates and longitudinal analyses are essential to track evolving paradigms.