Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies
Shuguang ZHOU, Kefa ZHOU, Jinlin WANG, Genfang YANG, Shanshan WANG
Cluster analysis is a well-known technique for analyzing many types of data. In this study, cluster analysis is applied to geochemical data from 1444 stream sediment samples collected in northwestern Xinjiang at a sample spacing of approximately 2 km. Three algorithms (hierarchical, k-means, and fuzzy c-means) and six data transformation methods (z-score standardization, ZST; logarithmic transformation, LT; additive log-ratio transformation, ALT; centered log-ratio transformation, CLT; isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of geochemical compositional data. The study shows that the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. For row- or observation-based (Q-type) cluster analysis, the results obtained after applying NT and the ZST are relatively poor, whereas improved results are obtained after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means algorithms are more reliable than the hierarchical algorithm for clustering the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area and find a good correlation between the potential zones of Au mineralization and the results obtained by combining the CLT or the ILT with the k-means or fuzzy c-means algorithm. We therefore suggest that combining the CLT or the ILT with the k-means or fuzzy c-means algorithm is an effective tool for identifying potential zones of mineralization from geochemical data.
cluster analysis / compositional data / geochemical anomaly / mineral exploration
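To make the recommended combination concrete, the following is a minimal sketch of a centered log-ratio transformation (CLT) followed by Q-type k-means clustering. It is not the authors' implementation: the six three-part compositions are hypothetical values invented for illustration, the helper names (`clr`, `kmeans`, `squared_distance`) are assumptions, and the farthest-first choice of starting centers is a simple deterministic substitute for the random initialization that k-means implementations commonly use.

```python
import math


def clr(composition):
    """Centered log-ratio: log of each part minus the mean of the logs.

    Equivalent to log(x_i / g(x)), where g(x) is the geometric mean.
    All parts must be strictly positive.
    """
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100):
    """Plain k-means with a deterministic farthest-first start.

    The initialization is an assumption made for reproducibility,
    not something specified in the source.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        # Pick the point farthest from all centers chosen so far.
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centers),
        )
        centers.append(list(farthest))

    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical compositions (rows = samples, columns = parts summing to 1).
samples = [
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.72, 0.19, 0.09],
    [0.10, 0.30, 0.60],
    [0.12, 0.28, 0.60],
    [0.09, 0.31, 0.60],
]

transformed = [clr(row) for row in samples]
labels, centers = kmeans(transformed, k=2)
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

Because the transformation works on ratios between parts, rescaling a whole sample (for example from fractions to percentages) leaves its transformed coordinates unchanged, which is one reason log-ratio methods are suited to closed compositional data.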