1 Introduction
Urban parks and green spaces are important public resources for cities and are closely related to public health, ecological and environmental quality, and social well-being
[1]. Meanwhile, new social demands are emerging continuously, especially with the growing construction of “Park Cities” and “Forest Cities” in China. Enhancing park service effectiveness to support urban regeneration and renewal has therefore become a difficult task that park managers must face
[2]. The rapid development and wide application of new technologies such as big data and artificial intelligence have provided new opportunities for improving the service effectiveness of parks. This paper explores their application across the full workflow of an “evaluation–diagnosis–decision-making” framework for enhancing park service effectiveness, aiming to propose an effective method that improves the actual management of urban parks.
2 Literature Review
“Effectiveness” is a term from management studies. Peter F. Drucker held that effectiveness refers to the ability to select appropriate goals and achieve them
[3]. Chinese scholars generally interpret the connotation of effectiveness as effect, efficiency, and ability
[4] [5]. Referring to the existing research on the connotation of effectiveness
[6], this paper defines “park service effectiveness” as the service effectiveness of a park and the service capability of its park managers. To enhance park service effectiveness, it is necessary to evaluate the existing service effectiveness and capabilities, and to make adjustment decisions according to the evaluation results. This requires a clear decision-making cascade, from a scientific evaluation system for park service effectiveness and efficient evaluation technologies to the effective application of evaluation results.
In the dimension of evaluation system, existing research divided park effectiveness into four categories—ecological, economic, social, and comprehensive—and summarized the evaluation content for each
[7]. Regarding evaluation scales, the macroscopic evaluation research focuses on the accessibility, distribution equity, and vitality of urban parks
[8]~[10], while microscopic evaluation studies examine certain park types or individual parks, with an emphasis on visitor volume, visitor satisfaction, and facility configuration
[11] [12]. In terms of evaluation technologies, UAV (unmanned aerial vehicle) remote sensing, text analysis, geospatial analysis, and big data technology have been widely used
[13]~[16]. Traditional data (e.g., statistical data, questionnaire survey data, remote sensing data) and emerging data (e.g., Internet map data, mobile signaling data, public opinion data) have all supported various types of effectiveness evaluations
[17]~[19]. As for application, although scholars have studied how evaluation results can inform the decision-making of relevant management departments
[20], the follow-up strategic recommendations put forward often vary in specificity and comprehensiveness.
In general, although existing studies support decision-making for park service effectiveness enhancement from different perspectives, most of them emphasize evaluation over application, and the construction of the decision-making cascade is relatively weak, especially in the following three aspects. First, there is a misalignment between park management research and the actual deficiencies. Current research concentrates on the service effectiveness of specific park types and tends to offer generic recommendations, neglecting the need for precise deficiency-finding and targeted solutions for individual parks in daily management. Second, existing studies frequently examine a single effectiveness category, and the data used often struggle to meet the dual demands of comprehensiveness and specificity. Third, actual park management emphasizes timeliness and continuity, but the majority of existing research outcomes are one-off and lag behind current conditions, and there is scant discussion of how to provide ongoing decision-making support under cost constraints.
In response to the limitations of traditional technical means and approaches, this study addresses the entire workflow of enhancing park service effectiveness and proposes an “evaluation–diagnosis–decision-making” framework supported by novel technologies. The method integrates a range of technologies, enabling a comprehensive investigation of park service effectiveness and the precise formulation of diagnoses and optimization suggestions, and effectively informing park managers as they develop actionable plans.
3 Establishment of the “Evaluation–Diagnosis–Decision-making” Framework for Park Service Effectiveness Evaluation
The “evaluation–diagnosis–decision-making” framework proposed in this study for evaluating park service effectiveness is illustrated in Fig.1. Evaluation, as the initial step, begins with the investigation and monitoring of common data, particularly various types of active sensing data; intelligent algorithms such as text and visual algorithms are then employed to construct and compute service effectiveness indicators. At the diagnosis stage, subjective diagnoses are used to generate qualitative impressions of the park, and its strengths and weaknesses are then identified based on computations and analyses of each indicator. Leveraging knowledge graph technology, the indicators are linked with the respective responsible entities and relevant policies to generate targeted recommendations. Finally, decision-making tailored to actual park management scenarios is supported by automatically generated periodic evaluation reports that use templates and intelligent technologies. Evaluation reports and related knowledge are integrated into an intelligent question-and-answer (Q&A) service built on a large language model (LLM), providing park managers with a convenient query tool for decision-making.
3.1 Methods for Investigation and Computation of Park Service Effectiveness
3.1.1 Multi-Sourced Data for Investigating Park Service Effectiveness
To comprehensively cover the data for investigating park service effectiveness, this study categorized commonly used data into different types
[21] (Tab.1). Specifically, park management data refers to data collected, recorded, and analyzed about park operations, environmental conditions, and visitor activities. Social perception data refers to large amounts of spatio-temporal data collected through modern information technologies. This type of data is characterized by broad coverage and rapid updates, effectively filling data gaps about emerging park usage behaviors and scenarios, such as Citywalk
[21]. Active sensing data refers to detailed data about the built environment of parks, collected in a targeted manner using sensing equipment. Through algorithmic analysis, it enables the detection of facility damage and shortages, for example in barrier-free or outdoor fitness facilities. The integration and combined use of multi-sourced data provides robust support for the exploration of park service effectiveness.
3.1.2 Intelligent Algorithms for Indicators
This study does not propose a universal indicator system (a “one-size-fits-all” framework) for park service effectiveness evaluation; instead, it advocates that the indicator system should be tailored to specific evaluation targets, where intelligent algorithms offer adaptable solutions and enable broader-scope, refined computation of indicators
[22]~[24]. Intelligent algorithms in this study are categorized into text-based, visual, and other types
[25]~[34] (Tab.2); the application of the first two types is elaborated in the rest of this section.
For text-based indicator computations, text algorithms can be used to analyze visitors’ satisfaction level with specific park elements (Fig.2). In contrast to conventional methods that could only differentiate positive and negative emotions, advanced methods such as text embedding, cosine distance calculation, and spectral density algorithms can accurately identify complex emotions in an unsupervised manner
[35]. Furthermore, entity extraction algorithms
[27] are employed to generalize and extract park elements. Based on satisfaction scores and complex emotion analysis derived from text scoring models, specific factors affecting visitors’ emotions and satisfaction are uncovered.
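The cosine distance step mentioned above can be illustrated with a minimal sketch. It assumes that review texts have already been converted to embedding vectors and that each complex emotion is represented by a reference vector (for instance, a cluster centre found by the unsupervised step). The three-dimensional vectors, the emotion labels, and the function names are hypothetical and serve only as an illustration; they are not the models used in this study.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def nearest_emotion(review_vec, emotion_vecs):
    """Pick the emotion label whose reference vector is closest to the review."""
    return min(
        emotion_vecs,
        key=lambda label: cosine_distance(review_vec, emotion_vecs[label]),
    )

# Hypothetical 3-dimensional embeddings; a real text-embedding model would
# produce vectors with hundreds of dimensions.
EMOTION_VECS = {
    "relaxed": [0.9, 0.1, 0.0],
    "nostalgic": [0.1, 0.9, 0.1],
    "annoyed": [0.0, 0.1, 0.9],
}
review_vec = [0.2, 0.8, 0.2]
label = nearest_emotion(review_vec, EMOTION_VECS)
print(label)
```

Cosine distance compares the direction of two vectors rather than their length, so reviews of different lengths remain comparable once embedded.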
For visual indicators, image data constitutes an important source for evaluation
[36][37] and can be combined with deep learning algorithms to compute indicators across scales. The EfficientTeacher algorithm and the YOLOv8 model
[28][38] form the basis of this study, enabling model training with limited samples to identify common targets such as vacant parking lots and barrier-free facilities. Additionally, zero-shot algorithms are employed to recognize uncommon targets
[29]. Segmentation models based on Transformer algorithms are applied to compute indicators such as green view index and sky visibility. Compared with CNN models, which are proficient at processing high-frequency (local detail) information, Transformer-based models can better capture global features of streetscapes, making them more appropriate for complex environments such as parks
[39]. Then, the final results of all calculated indicators are formed, providing data support for the generation of subsequent reports (Fig.3).
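Once a segmentation model has labelled every pixel, indicators such as green view index and sky visibility reduce to pixel shares. The sketch below assumes a small label mask with hypothetical class names; an actual segmentation model would output integer class ids on a much larger grid, and the choice of which classes count as "green" is an assumption made here for illustration.

```python
def view_index(mask, target_labels):
    """Share of pixels in a 2D label mask that belong to target_labels."""
    total = sum(len(row) for row in mask)
    if total == 0:
        raise ValueError("mask is empty")
    hits = sum(1 for row in mask for label in row if label in target_labels)
    return hits / total

# Hypothetical class names standing in for segmentation output.
mask = [
    ["sky", "sky", "tree", "tree"],
    ["tree", "tree", "tree", "building"],
    ["grass", "path", "path", "grass"],
]
green_view = view_index(mask, {"tree", "grass"})  # vegetation pixels / all pixels
sky_visibility = view_index(mask, {"sky"})        # sky pixels / all pixels
print(round(green_view, 3), round(sky_visibility, 3))
```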
3.2 Deficiency Diagnosis and Optimization Recommendation
3.2.1 Generation of Park Impressions
With the text-based algorithms, a frequency analysis of keywords in park review texts is undertaken to compile lists of high-frequency words for individual parks. The lists are then used to generate park impression word clouds, forming a preliminary visualization of how visitors’ focus differs across parks. To further refine the park impressions, lexicons of high-frequency words (e.g., emotions, seasons) from all parks are constructed. The text data of visitor reviews for each park are statistically analyzed and categorized by word frequency, and the 10 highest-ranking words are selected as keywords. By integrating word frequency and the weight of high-frequency words, the score of each keyword can be calculated. Higher-scoring words are then taken as the core descriptors for generating park impressions and are matched with their respective categories to summarize the preliminary impressions. Finally, these impressions are polished using LLMs to generate more sophisticated statements, such as “happiness is the dominant emotion associated with XX Park” or “autumn is the most popular season for tourists to XX Park.”
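The keyword scoring step can be sketched as follows. The text does not specify how frequency and weight are combined, so the sketch assumes the simplest option, score = frequency × lexicon weight. The lexicon entries, weights, and review tokens are hypothetical, and the final LLM polishing step is omitted.

```python
from collections import Counter

def park_impressions(review_tokens, lexicon, top_n=10):
    """Score the top_n most frequent lexicon words of one park.

    lexicon maps word -> (category, weight). Returns a list of
    (word, category, score) tuples sorted by descending score.
    """
    counts = Counter(tok for tok in review_tokens if tok in lexicon)
    top_words = counts.most_common(top_n)
    scored = [
        (word, lexicon[word][0], freq * lexicon[word][1])  # assumed formula
        for word, freq in top_words
    ]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored

# Hypothetical lexicon and tokenized reviews.
LEXICON = {
    "autumn": ("season", 1.0),
    "happy": ("emotion", 1.0),
    "boating": ("amusement", 1.5),
}
tokens = ["autumn", "happy", "autumn", "boating", "happy", "happy", "crowded"]
impressions = park_impressions(tokens, LEXICON)
word, category, score = impressions[0]
print(f"'{word}' is the leading {category} keyword (score {score})")
```

Words outside the lexicon (here "crowded") are ignored, which keeps the impressions tied to the predefined categories.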
3.2.2 Deficiency Diagnosis
The park service effectiveness of individual parks can be ranked by their scores on each indicator and assessed as superior, average, or underperforming according to their rankings. In addition, the deficiencies of individual parks can be diagnosed from the trend of a single indicator over a given time span. Indicators with established evaluation criteria (e.g., park capacity utilization, restroom density) can also be assessed or rated by whether the park meets the required benchmarks. It is important to note that, given the inherent differences among park types, deficiency diagnosis only compares parks of the same type, and evaluates each park’s relative strengths and weaknesses based on its rankings for specific indicators.
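A minimal sketch of the within-type ranking is given below. The cut-offs between levels are not stated in the text, so the sketch assumes that the top third of each type is "superior" and the bottom third "underperforming". Park names, park types, and scores are hypothetical.

```python
from collections import defaultdict

def rank_levels(scores, park_types):
    """Label each park within its own park type for a single indicator."""
    groups = defaultdict(list)
    for park, score in scores.items():
        groups[park_types[park]].append((score, park))

    levels = {}
    for members in groups.values():
        members.sort(reverse=True)  # highest score first
        n = len(members)
        for rank, (_, park) in enumerate(members):
            if rank < n / 3:              # assumed cut-off: top third
                levels[park] = "superior"
            elif rank >= n - n / 3:       # assumed cut-off: bottom third
                levels[park] = "underperforming"
            else:
                levels[park] = "average"
    return levels

# Hypothetical indicator scores for six parks of two types.
scores = {"A": 82, "B": 67, "C": 74, "D": 90, "E": 55, "F": 71}
park_types = {
    "A": "community", "B": "community", "C": "community",
    "D": "comprehensive", "E": "comprehensive", "F": "comprehensive",
}
levels = rank_levels(scores, park_types)
print(levels)
```

Grouping before ranking ensures that park A is compared only with parks B and C, even though park D has a higher raw score.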
3.2.3 Generating Optimization Recommendations
The generation of optimization recommendations for enhancing park service effectiveness relies on knowledge graph technology, which connects indicators, policies, and responsible entities. In contrast to unstructured text data, knowledge graphs are a structured form of knowledge representation capable of describing concepts and their interrelations. By integrating knowledge, data, and their interrelations into a large-scale semantic network, knowledge graphs support faster computation, interpretation, and evaluation, as well as knowledge retrieval and reasoning. Their ability to process complex network structures makes them particularly well suited to generating optimization recommendations for enhancing park service effectiveness
[40].
Specifically, indicator graphs standardize attributes such as indicator name, definition, hierarchy, and source, and generate a network that maps the relationships between indicator–data entities and indicator–indicator entities. This achieves the integration of the indicator system with multi-sourced data. Policy graphs adopt publicly available policy documents and provisions related to specific indicators as the basis for optimization recommendations. The graph further extracts keywords from policy texts as representations of policy content, and frequently co-occurring keywords are then linked to the same thematic nodes, enabling the semantic decomposition of policy texts and the representation of relationships among associated policies. Responsibility graphs delineate the entities charged with specific implementation tasks and establish a network connecting indicators, leading entities, responsible entities, task details, and authority bases.
Finally, the “indicator–policy–responsibility” knowledge graphs are matched with the relatively weaker indicators to establish linkages between relevant policy texts and responsible entities and produce detailed, textual optimization recommendations, with the aid of text generation technology.
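The matching of weak indicators against the “indicator–policy–responsibility” graphs can be sketched with a small list of triples. This is only an illustration under stated assumptions: the relation names, policy identifiers, and entity names are hypothetical, a production system would use a graph database, and the fixed sentence pattern stands in for the text generation step.

```python
# Hypothetical (subject, relation, object) triples.
TRIPLES = [
    ("path cleanliness", "governed_by", "Policy P-01"),
    ("path cleanliness", "responsible_entity", "Sanitation team"),
    ("restroom density", "governed_by", "Policy P-02"),
    ("restroom density", "responsible_entity", "Facilities office"),
]

def neighbours(triples, subject, relation):
    """Objects linked to `subject` through `relation`."""
    return [obj for subj, rel, obj in triples if subj == subject and rel == relation]

def recommend(triples, weak_indicators):
    """Build one plain-text recommendation per weak indicator."""
    recs = []
    for indicator in weak_indicators:
        policies = neighbours(triples, indicator, "governed_by")
        entities = neighbours(triples, indicator, "responsible_entity")
        recs.append(
            f"{indicator}: {', '.join(entities) or 'unassigned'} should act "
            f"with reference to {', '.join(policies) or 'no linked policy'}."
        )
    return recs

recs = recommend(TRIPLES, ["path cleanliness"])
print(recs[0])
```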
3.3 Report Generation and Decision-Making
3.3.1 Dynamic Batch Generation of Reports
In practice, evaluation results, deficiency diagnoses, and optimization recommendations are often presented in the form of reports. However, producing a large number of reports for numerous individual parks with conventional methods requires considerable human labor. This study proposes an automatic approach to generating park evaluation reports in batches by using templates and intelligent methods (Fig.4). Templates offer high standardization and efficiency, but most existing templates lack precision and flexibility when addressing complex tasks
[41]. Under the proposed framework, LLMs are integrated with report templates to enhance the flexibility, specificity, and readability of the generated reports.
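The template half of this approach can be sketched with the Python standard library. The field names, report layout, and park records below are hypothetical, and the LLM-based polishing of the filled template is deliberately left out.

```python
from string import Template

# Hypothetical report layout; a real template would hold many more sections.
REPORT_TEMPLATE = Template(
    "Park: $park\n"
    "Period: $period\n"
    "Overall level: $level\n"
    "Weak indicators: $weak\n"
)

def build_reports(records):
    """Fill the template once per park record and return {park: report_text}."""
    reports = {}
    for rec in records:
        reports[rec["park"]] = REPORT_TEMPLATE.substitute(
            park=rec["park"],
            period=rec["period"],
            level=rec["level"],
            weak=", ".join(rec["weak"]) or "none",
        )
    return reports

# Hypothetical evaluation records for one reporting period.
records = [
    {"park": "Park A", "period": "2023 Q3", "level": "underperforming",
     "weak": ["path cleanliness"]},
    {"park": "Park B", "period": "2023 Q3", "level": "superior", "weak": []},
]
reports = build_reports(records)
print(reports["Park A"])
```

Because the template is separated from the records, the same layout can be re-run for every evaluation cycle without manual editing.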
3.3.2 Intelligent Q&A Service
To address park managers’ actual needs in the decision-making process more effectively, a more user-friendly, efficient, and customized mode of interaction is provided in the form of Q&A. First, a suitable general-purpose LLM is selected for Q&A training, in conjunction with the construction of a localized knowledge base for the enhancement of park service effectiveness, which encompasses basic park information, facility descriptions, activity schedules, evaluation results of different indicators, and reports. To address inaccurate content generation, the isolated use of knowledge base entries, and difficulties in representing interrelations caused by insufficient understanding of context, the research team introduced a knowledge-first preference alignment method. This method integrates embedding models, rule-based templates, and recommendation algorithms to retrieve relevant information from the knowledge base, yielding a Q&A service system that provides accurate and detailed responses tailored to park service effectiveness.
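The retrieve-then-answer pattern behind such a service can be sketched as follows. To stay self-contained, the sketch replaces the embedding model with a plain keyword-overlap score, which is a deliberate simplification and not the alignment method described above. The knowledge base entries, the prompt wording, and the function names are hypothetical, and the call to the LLM itself is omitted.

```python
def tokenize(text):
    """Lower-cased word set with basic punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}

def retrieve(question, knowledge_base, k=2):
    """Return up to k entries sharing the most words with the question."""
    q = tokenize(question)
    scored = []
    for entry in knowledge_base:
        overlap = len(q & tokenize(entry))
        if overlap:
            scored.append((overlap, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable for ties
    return [entry for _, entry in scored[:k]]

def build_prompt(question, knowledge_base):
    """Rule-based template that places retrieved facts before the question."""
    lines = ["Answer using only the facts below."]
    lines += [f"- {fact}" for fact in retrieve(question, knowledge_base)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)

# Hypothetical knowledge base entries.
KB = [
    "Park A scored underperforming on path cleanliness in Q3.",
    "Park A has a barrier-free entrance.",
    "Park B hosts a lantern festival in February.",
]
question = "How did Park A perform on path cleanliness?"
prompt = build_prompt(question, KB)
print(prompt)
```

Common words such as "a" or "on" inflate overlap scores, which is one reason an embedding model is preferable to this stand-in.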
4 Empirical Research on the “Evaluation–Diagnosis–Decision-making” Framework for Park Service Effectiveness Enhancement
4.1 Research Overview
Beijing Municipality has been conducting park service effectiveness evaluations since 2022. The evaluations cover the seven types of parks outlined in the Beijing Park Classification and Management Measures (revised in 2022): comprehensive parks, community parks, historical gardens, specialized parks, recreational gardens, ecological parks, and natural parks. By the end of 2023, evaluations of a total of 364 parks across all districts in the city had been completed using the “evaluation–diagnosis–decision-making” framework proposed in this paper, enabling evaluations of park service effectiveness over different time periods and informing decision-making for park service effectiveness enhancement at the city scale.
4.2 Application Process
4.2.1 Construction of Evaluation Indicator System
The research team initially conducted an analysis of citizens’ demands, and then constructed a preliminary evaluation indicator system of park service effectiveness, drawing on the connotation of park service effectiveness and relevant domestic and international research
[11][42]~[44]. Afterward, the research team examined the quality and update status of the data gathered by the city’s park management authorities, as well as the quality of available social perception data and active sensing data. Considering the feasibility of indicator computation and the sustainability of data updates, an evaluation indicator system was finally constructed, which includes two main categories, 4 primary indicators, 12 secondary indicators, and 26 tertiary indicators (Tab.3). Different types of parks vary in terms of indicator selection and weight setting.
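The idea that park types differ in indicator selection and weight setting can be sketched as a type-specific weighted average. The aggregation rule, indicator names, and weights below are assumptions made for illustration; the actual indicators and weights are those listed in Tab.3.

```python
# Hypothetical indicator selections and weights per park type.
WEIGHTS_BY_TYPE = {
    "community park": {
        "accessibility": 0.5,
        "facility condition": 0.3,
        "visitor satisfaction": 0.2,
    },
    "ecological park": {
        "vegetation cover": 0.6,
        "visitor satisfaction": 0.4,
    },
}

def composite_score(park_type, indicator_scores):
    """Weighted average over the indicators selected for this park type."""
    weights = WEIGHTS_BY_TYPE[park_type]
    missing = set(weights) - set(indicator_scores)
    if missing:
        raise KeyError(f"missing indicator scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * indicator_scores[name] for name in weights)
    return weighted_sum / total_weight

score = composite_score(
    "community park",
    {"accessibility": 80, "facility condition": 60, "visitor satisfaction": 70},
)
print(round(score, 1))
```

Dividing by the total weight keeps scores comparable even if the weights of a park type do not sum exactly to one.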
4.2.2 Generation of Park Impressions
The research team developed 10 lexicons, namely “Park Name,” “Entrance and Exit,” “Time,” “Mood,” “Activity,” “Natural Landscape,” “Cultural Landscape,” “Amusement,” “Service,” and “Sports and Fitness.” These lexicons were used to generate impressions for each park. For example, “camping” is the most mentioned activity at the Grand Canal Forest Park; “boating” is the most popular amusement at Taoranting Park; and “sunflowers” are the favorite natural landscape at Olympic Forest Park. These park impressions reflect the alignment between park service effectiveness and visitors’ needs, offering park managers a reference for improvement directions.
4.2.3 Deficiency Diagnosis and Optimization Recommendation Generation
After the indicators are computed, park managers can view the overall ranking of park service effectiveness across the city, the ranking of specific indicators, and the strengths and weaknesses of individual parks. By utilizing the “indicator–policy–responsibility” knowledge graphs, optimization recommendations were generated based on each park’s weaker indicators. Fig.5 shows an example of a policy graph, where orange nodes represent keywords of policy texts, blue nodes represent keyword groups, and purple nodes represent the index of policy texts in the knowledge base.
The research team integrated the relative strength and weakness indicators, deficiency diagnoses, and corresponding optimization recommendations into a standardized report (Fig.6), enabling park managers to quickly and clearly understand how to translate indicator results into actionable plans. The research team developed an engine for report template design, which, through built-in intelligent diagnostic logic and configurable generation cycles, can automatically and regularly produce reports, ensuring the continuity of park service effectiveness monitoring. During report generation, the system used a Transformer-based language model to analyze and polish the initial template-based report, improving its readability and professionalism. The final reports, refined and more concise, were exported in formats such as PDF and Word, making them easy for users to access, edit, and share via various platforms and devices.
For the intelligent Q&A service, the research team built a localized knowledge base for park service effectiveness using ChatGLM3, a fine-tuned open-source general-purpose LLM, combined with urban governance-related datasets. This knowledge base was then embedded into the front-end page of the park management system, giving managers convenient access to decision-making knowledge related to park service effectiveness (Fig.7).
4.3 Application Effectiveness
In Beijing’s evaluations of park service effectiveness, the method proposed by the research team has enabled a comprehensive and in-depth assessment of park service effectiveness from a city-wide perspective. This has effectively overcome conventional limitations such as single-category focus, lack of continuity, and insufficient application support. Currently, the evaluations have not only become one of the key tasks of the Beijing Municipal Forestry and Parks Bureau in promoting Garden City construction, but have also facilitated park management through continuous and precise tracking of park service effectiveness. Taking the Central Green Forest Park as an example, in the third quarter of 2023 the park was diagnosed as “underperforming” in terms of operations and maintenance, and the report accordingly recommended that the park improve its sanitation and cleanliness. After an overall inspection of all the sanitation indicators, the park improved its path cleanliness service effectiveness by 4.5% in the fourth-quarter evaluation of the same year.
5 Conclusions
This paper reviews the connotation of park service effectiveness, identifies deficiencies in actual operation through investigations and surveys, and proposes a framework for enhancing park service effectiveness that is innovative in both process coverage and the integration of technological applications. The framework mines multi-sourced data and applies intelligent algorithms to swiftly generate reports containing a multitude of evaluation indicators, deficiency diagnoses of individual parks, and optimization recommendations, facilitating the extensive deployment of refined park service effectiveness evaluation. The framework also takes into account the actual demands of park managers and provides a Q&A service to better assist them in converting optimization recommendations into action plans. It has been applied to the enhancement of park service effectiveness in Beijing and achieved remarkable results.
With the continuous increase in the public’s demand for enhancing park service effectiveness, this paper attempts to expand related research and to provide new insights and paths for bridging the gap between academic outcomes and the practical application of decision-making in park management. Follow-up research, on the one hand, should further improve the evaluation workflow and explore closed-loop solutions integrating planning, design, construction, and operation. On the other hand, it is essential to continuously deepen the level of technology integration in each application step (e.g., the continuous optimization of the base model, and the construction and retrieval of the knowledge base) to generate more refined deficiency diagnoses and optimization recommendations. In addition, researchers should also consider how to introduce geographical and spatial concepts into language models, so as to improve the accuracy of the Q&A service and to respond to diverse geographical and spatial requests, promoting its effectiveness and applicability.