Landslide susceptibility assessment using machine learning with a novel SHAP-based sampling strategy

Lei-Lei Liu, Can Duan, Jun-Hua Gao, Hao Xiao, Wen-Qing Zhu, Can Yang

Geoscience Frontiers, 2026, Vol. 17, Issue 2: 102188

DOI: 10.1016/j.gsf.2025.102188

Abstract

Landslide and non-landslide samples are essential inputs for machine learning-based landslide susceptibility assessment. Compared with landslide samples, non-landslide samples generally carry higher uncertainty owing to random sampling. However, most non-landslide sampling strategies (e.g., feature space-based strategies) consider only the characteristics of a single factor or the overall characteristics of all factors, which leads to either excessive artificial concentration of non-landslide samples or redundant sampling information. To address these issues, a SHapley Additive exPlanations (SHAP)-based sampling strategy that considers the combined characteristics of landslide conditioning factors (LCFs) is proposed. The strategy ranks the LCFs by importance using the SHAP algorithm and generates multiple sampling spaces from different numbers of LCFs taken in order of importance. The optimal sampling space is then selected with a Bayesian optimization algorithm. Random forest (RF) and extreme gradient boosting (XGBoost) models are used to assess landslide susceptibility in Chaling County, Yanling County, and Guidong County, China, under both the proposed strategy and traditional random sampling. The results indicate that, compared with the traditional RF and XGBoost models, the improved models perform better, with increases of 8.2% and 9.0% in the area under the receiver operating characteristic curve (AUC), respectively. Furthermore, the SHAP-based sampling framework adapts well across study areas with different geological and geomorphic conditions, suggesting that it may transfer to other regions, although local optimization of parameter settings may still be required.
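
The ranking and nested-space steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the factor names, the toy scoring function standing in for a trained RF or XGBoost model, and the all-zero baseline are assumptions made for illustration. Shapley values are computed exactly by enumerating factor subsets, which is only practical for a handful of factors. The Bayesian optimization step that selects the optimal sampling space is left out because the abstract does not specify its objective.

```python
from itertools import combinations
from math import factorial

# Hypothetical LCF names, chosen only for this illustration.
FACTORS = ["slope", "rainfall", "ndvi"]


def toy_model(x):
    # Hypothetical stand-in for a trained RF/XGBoost susceptibility model.
    return 3.0 * x["slope"] + 1.0 * x["rainfall"] + 0.5 * x["slope"] * x["rainfall"]


def shapley_values(model, x, baseline, factors):
    """Exact Shapley values of one sample relative to a baseline input."""
    n = len(factors)
    phi = {}
    for f in factors:
        others = [g for g in factors if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Factors in the coalition take the sample's value,
                # the rest take the baseline value.
                without = {g: (x[g] if g in subset else baseline[g]) for g in factors}
                with_f = dict(without)
                with_f[f] = x[f]
                total += weight * (model(with_f) - model(without))
        phi[f] = total
    return phi


def rank_factors(model, samples, baseline, factors):
    """Rank factors by mean absolute Shapley value over all samples."""
    mean_abs = {f: 0.0 for f in factors}
    for x in samples:
        phi = shapley_values(model, x, baseline, factors)
        for f in factors:
            mean_abs[f] += abs(phi[f]) / len(samples)
    return sorted(factors, key=lambda f: mean_abs[f], reverse=True)


def nested_sampling_spaces(ranked):
    """Candidate spaces built from the top-1, top-2, ... ranked factors."""
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


samples = [
    {"slope": 1.0, "rainfall": 2.0, "ndvi": 0.3},
    {"slope": 0.5, "rainfall": 1.0, "ndvi": 0.9},
]
baseline = {"slope": 0.0, "rainfall": 0.0, "ndvi": 0.0}

ranked = rank_factors(toy_model, samples, baseline, FACTORS)
spaces = nested_sampling_spaces(ranked)
print(ranked)
print(spaces)
```

Each entry of `spaces` is one candidate set of factors in which non-landslide samples could be drawn. In practice the Shapley values would more likely come from a tree-specific explainer applied to the fitted model than from brute-force enumeration.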

Keywords

Landslide susceptibility / SHapley Additive exPlanations (SHAP) / Interpretable machine learning / Sampling strategy / Landslide conditioning factors (LCFs)

Cite this article

Lei-Lei Liu, Can Duan, Jun-Hua Gao, Hao Xiao, Wen-Qing Zhu, Can Yang. Landslide susceptibility assessment using machine learning with a novel SHAP-based sampling strategy. Geoscience Frontiers, 2026, 17(2): 102188. DOI: 10.1016/j.gsf.2025.102188

CRediT authorship contribution statement

Lei-Lei Liu: Writing - review & editing, Writing - original draft, Resources, Investigation, Funding acquisition, Conceptualization. Can Duan: Writing - original draft, Validation, Software, Methodology. Jun-Hua Gao: Writing - original draft, Validation, Software, Methodology. Hao Xiao: Writing - review & editing, Writing - original draft, Validation, Software, Funding acquisition. Wen-Qing Zhu: Writing - review & editing, Validation, Formal analysis. Can Yang: Writing - review & editing, Writing - original draft, Validation, Methodology, Formal analysis, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

The work described in this paper was supported by grants from the Project of Science and Technology Support Program of Guizhou Province (Project No. Qiankehe Support [2023] General 137) and the Fundamental Research Funds for the Central Universities of Central South University (Project No. 2023ZZTS0477). The financial support is gratefully acknowledged.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.gsf.2025.102188.
