PyLUR: Efficient software for land use regression modeling the spatial distribution of air pollutants using GDAL/OGR library in Python
Xuying Ma, Ian Longley, Jennifer Salmond, Jay Gao
• PyLUR comprises four modules for developing and applying a LUR model.
• It considers both conventional and novel potential predictor variables.
• GDAL/OGR libraries are used to perform spatial analysis in modeling and prediction.
• Developed on the Python platform, PyLUR is efficient in data processing.
Land use regression (LUR) models have been widely used in air pollution modeling. This regression-based approach estimates ambient pollutant concentrations at unsampled points of interest by considering the relationship between ambient concentrations and several predictor variables selected from the surrounding environment. Although conceptually quite simple, its successful implementation requires detailed knowledge of the study area and expertise in GIS, statistics, and programming, which makes this modeling approach relatively inaccessible to novice users. In this contribution, we present LUR modeling and pollution-mapping software named PyLUR. It uses the GDAL/OGR libraries on the Python platform and can build a LUR model and generate pollutant concentration maps efficiently. The software comprises four modules: a potential predictor variable generation module, a regression modeling module, a model validation module, and a prediction and mapping module. The performance of PyLUR is compared with that of an existing LUR modeling software package, RLUR (which implements similar functions on the R language platform), in terms of model accuracy, processing efficiency, and software stability. The results show that PyLUR outperforms RLUR in the Bradford and Auckland case studies examined. Furthermore, PyLUR is much more efficient in data processing and can handle detailed GIS input data.
LUR / Air pollution modelling / GIS spatial analysis / GDAL/OGR Python / Pollutant concentration mapping
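The four-step workflow described in the abstract (predictor generation, regression, validation, prediction) can be illustrated with a minimal sketch. This is not PyLUR's code or API: the function names (`buffer_count`, `fit_lur`, `predict_lur`, `loocv_r2`) are hypothetical, the spatial step uses plain NumPy distance calculations on planar coordinates rather than GDAL/OGR, the data are synthetic, and leave-one-out cross-validation is assumed here as one common validation choice.

```python
import numpy as np


def buffer_count(site, features, radius):
    """Predictor generation: count point features within `radius` of `site`."""
    d = np.hypot(features[:, 0] - site[0], features[:, 1] - site[1])
    return int(np.sum(d <= radius))


def fit_lur(X, y):
    """Regression: ordinary least squares with an intercept term."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_lur(coef, X):
    """Prediction: apply fitted coefficients to predictor values."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def loocv_r2(X, y):
    """Validation: leave-one-out cross-validated R^2."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # hold out site i
        coef = fit_lur(X[keep], y[keep])
        preds[i] = predict_lur(coef, X[i:i + 1])[0]
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic example (all values are made up for illustration).
rng = np.random.default_rng(0)
sites = rng.uniform(0, 1000, size=(30, 2))      # monitoring sites (m)
sources = rng.uniform(0, 1000, size=(200, 2))   # hypothetical emission points

# Two predictors: sources within a 150 m buffer, distance to a "road" at x = 500.
counts = np.array([buffer_count(s, sources, 150.0) for s in sites])
dist_road = np.abs(sites[:, 0] - 500.0)
X = np.column_stack([counts, dist_road]).astype(float)

# Simulated concentrations with noise.
y = 10.0 + 0.8 * counts - 0.01 * dist_road + rng.normal(0, 1.0, size=len(sites))

coef = fit_lur(X, y)
print("coefficients (intercept, count, distance):", coef)
print("LOOCV R^2:", loocv_r2(X, y))

# Predict at an unsampled location.
new_site = (250.0, 400.0)
x_new = np.array([[buffer_count(new_site, sources, 150.0), abs(new_site[0] - 500.0)]])
print("predicted concentration:", predict_lur(coef, x_new)[0])
```

In a real application the buffer statistics would come from GIS layers (for example road length or land-use area within each buffer), which is the role the abstract assigns to GDAL/OGR.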