Comparison and Applicability Study of Analysis Methods for Social Media Text Data: Taking Perception of Urban Parks in Beijing as an Example
Zhenyu SHANG, Kexin CHENG, Yuqing JIAN, Zhifang WANG
Rapidly developing Internet technology and media have generated large volumes of social media data. Social sensing analyses based on user reviews have consequently become a research hotspot and are increasingly applied to the study of urban park usage and perception. However, most existing studies adopt a single model for text data processing. To fill this gap, this study compares social media text data analysis methods and assesses their advantages, disadvantages, and applicability in park perception research. Two models widely used in relevant research were selected: the lexicon-based classification analysis model (lexicon model) and the Latent Dirichlet Allocation (LDA) model. Based on text data from public reviews of 10 urban parks in Beijing on Dianping, this study explored the distribution of perception topics for each park and for all parks together, and compared how the two models classified perception topics. Results show that the lexicon model is conducive to the parallel comparison of perception frequency between parks, while the LDA model can directly reflect each park's characteristics and visitors' perception preferences; the combined use of the two models can optimize park perception assessment. Results from both methods reveal that visitors to urban parks in Beijing focused more on their social recreation needs and the visual aesthetics of the natural landscape, as well as on the condition of transportation facilities and consumption in the parks. This research can inform the selection and use of social media text analysis methods, and provide a basis and guidance for improving park construction and management.
● Exploring the advantages, disadvantages, and applicability of two text analysis models
● The lexicon model is more suitable for parallel comparison between the objects that users perceive
● The Latent Dirichlet Allocation (LDA) model can better capture the characteristics of each individual perceived object
● Taking advantage of the two models’ strengths is vital for optimizing landscape perception assessment
Social Sensing / Text Analysis / Lexicon / Latent Dirichlet Allocation (LDA) / Urban Park / Landscape Perception
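To make the idea of lexicon-based classification concrete, the following is a minimal sketch, not the study's implementation. The topic names, keywords, and sample reviews are all hypothetical, and English tokens are used for readability; real Dianping reviews are in Chinese and would first need word segmentation. Each review is assigned every topic whose keywords it contains, and the share of reviews per topic gives a perception frequency that can be compared across parks.

```python
import re
from collections import Counter

# Hypothetical lexicon: perception topic -> keywords (NOT the study's lexicon).
LEXICON = {
    "natural_landscape": {"flower", "lake", "tree", "scenery"},
    "social_recreation": {"picnic", "friends", "family", "walk"},
    "transport": {"subway", "parking", "bus"},
    "consumption": {"ticket", "price", "fee"},
}


def classify(review: str) -> set[str]:
    """Return every topic whose keywords appear in the review."""
    tokens = set(re.findall(r"[a-z]+", review.lower()))
    return {topic for topic, words in LEXICON.items() if tokens & words}


def topic_frequency(reviews: list[str]) -> dict[str, float]:
    """Share of one park's reviews that mention each topic."""
    counts = Counter(topic for review in reviews for topic in classify(review))
    total = len(reviews)
    return {
        topic: (counts[topic] / total if total else 0.0) for topic in LEXICON
    }


# Hypothetical reviews for a single park.
park_a = [
    "Beautiful lake and flower beds",
    "Parking is hard to find",
    "Nice walk with family by the lake",
]
park_a_freq = topic_frequency(park_a)
```

Because every park is scored against the same fixed topic list, the resulting frequencies line up directly for park-to-park comparison, which is the property the abstract attributes to the lexicon model. An LDA model, by contrast, would infer its topics from each corpus rather than from a predefined lexicon.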