Dynamic Spatio-Temporal Tweet Mining for Event Detection: A Case Study of Hurricane Florence

Mahdi Farnaghi, Zeinab Ghaemi, Ali Mansourian

International Journal of Disaster Risk Science, 2020, Vol. 11, Issue 3: 378-393.

DOI: 10.1007/s13753-020-00280-z

Abstract

Extracting information about emerging events in large study areas through spatiotemporal and textual analysis of geotagged tweets makes it possible to monitor the current state of a disaster. This study proposes dynamic spatio-temporal tweet mining as a method for dynamic event extraction from geotagged tweets in large study areas. It introduces a modified version of the ordering points to identify the clustering structure (OPTICS) algorithm to address the intrinsic heterogeneity of Twitter data. To calculate textual similarity precisely, three state-of-the-art text embedding methods, Word2vec, GloVe, and FastText, were used to capture both syntactic and semantic similarities. The impact of the selected embedding algorithm on the quality of the outputs was studied. Different combinations of spatial and temporal distances with the textual similarity measure were investigated to improve the event detection outcomes. The proposed method was applied to a case study related to 2018 Hurricane Florence. The method was able to precisely identify events of varied sizes and densities before, during, and after the hurricane. The feasibility of the proposed method was quantitatively evaluated using the Silhouette coefficient and qualitatively discussed. The proposed method was also compared to an implementation based on the standard density-based spatial clustering of applications with noise (DBSCAN) algorithm, where it showed more promising results.
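The abstract does not state how the spatial, temporal, and textual distances are combined, so the following is only an illustrative sketch and not the authors' formulation. It assumes a weighted sum of capped, normalized components (great-circle distance, time difference in hours, and cosine distance between tweet embedding vectors) and then scores a given clustering with the Silhouette coefficient of Rousseeuw (1987). The function names, the caps `max_km` and `max_hours`, the equal weights, and the toy tweets with three-dimensional vectors are all hypothetical. The clustering step itself (the modified OPTICS) is not reproduced; the labels are supplied by hand.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors (range 0..2)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)


def combined_distance(t1, t2, max_km=50.0, max_hours=6.0, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Assumed combination: weighted sum of components scaled to [0, 1]."""
    d_space = min(haversine_km(t1["lat"], t1["lon"], t2["lat"], t2["lon"]) / max_km, 1.0)
    d_time = min(abs(t1["hour"] - t2["hour"]) / max_hours, 1.0)
    d_text = cosine_distance(t1["vec"], t2["vec"]) / 2.0  # rescale 0..2 to 0..1
    w_space, w_time, w_text = weights
    return w_space * d_space + w_time * d_time + w_text * d_text


def silhouette(dist, labels):
    """Mean Silhouette coefficient from a distance matrix.

    Points labelled -1 (noise) are skipped; at least two clusters are required.
    """
    clusters = sorted({l for l in labels if l != -1})
    scores = []
    for i, li in enumerate(labels):
        if li == -1:
            continue
        own = [j for j, lj in enumerate(labels) if lj == li and j != i]
        if not own:
            scores.append(0.0)  # singleton cluster scores 0 by convention
            continue
        a = sum(dist[i][j] for j in own) / len(own)  # mean intra-cluster distance
        b = min(  # mean distance to the nearest other cluster
            sum(dist[i][j] for j, lj in enumerate(labels) if lj == c) / labels.count(c)
            for c in clusters
            if c != li
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data: two tweets close in space, time, and text, and two others far from them.
tweets = [
    {"lat": 34.23, "lon": -77.94, "hour": 0, "vec": [1.0, 0.0, 0.0]},
    {"lat": 34.24, "lon": -77.95, "hour": 1, "vec": [0.9, 0.1, 0.0]},
    {"lat": 35.78, "lon": -78.64, "hour": 30, "vec": [0.0, 1.0, 0.0]},
    {"lat": 35.79, "lon": -78.65, "hour": 31, "vec": [0.1, 0.9, 0.0]},
]
dist = [[combined_distance(a, b) for b in tweets] for a in tweets]
labels = [0, 0, 1, 1]  # hand-assigned clusters standing in for a clustering output
score = silhouette(dist, labels)
print(f"mean silhouette: {score:.3f}")
```

A distance matrix of this kind could in principle be passed to any density-based clustering implementation that accepts precomputed distances. A score close to 1 indicates compact, well-separated clusters, and a score near or below 0 indicates overlapping ones.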

Keywords

Disaster management / Hurricane Florence / Natural language processing / Spatio-temporal tweet analysis / Tweet clustering / Twitter

Cite this article

Mahdi Farnaghi, Zeinab Ghaemi, Ali Mansourian. Dynamic Spatio-Temporal Tweet Mining for Event Detection: A Case Study of Hurricane Florence. International Journal of Disaster Risk Science, 2020, 11(3): 378-393. DOI: 10.1007/s13753-020-00280-z


