Dynamic Spatio-Temporal Tweet Mining for Event Detection: A Case Study of Hurricane Florence
Mahdi Farnaghi, Zeinab Ghaemi, Ali Mansourian
International Journal of Disaster Risk Science, 2020, Vol. 11, Issue 3: 378–393.
Extracting information about emerging events through spatio-temporal and textual analysis of geotagged tweets makes it possible to monitor the current state of a disaster across a large study area. This study proposes dynamic spatio-temporal tweet mining as a method for dynamic event extraction from geotagged tweets in large study areas. To address the intrinsic heterogeneity of Twitter data, it introduces a modified version of the ordering points to identify the clustering structure (OPTICS) algorithm. To calculate textual similarity precisely, three state-of-the-art text embedding methods, Word2vec, GloVe, and FastText, were used to capture both syntactic and semantic similarities, and the impact of the selected embedding algorithm on the quality of the outputs was studied. Different combinations of spatial and temporal distances with the textual similarity measure were investigated to improve the event detection outcomes. The proposed method was applied to a case study of Hurricane Florence in 2018, where it identified events of varied sizes and densities before, during, and after the hurricane. The feasibility of the proposed method was quantitatively evaluated using the Silhouette coefficient and qualitatively discussed. The method was also compared with an implementation based on the standard density-based spatial clustering of applications with noise (DBSCAN) algorithm and showed more promising results.
Keywords: Disaster management / Hurricane Florence / Natural language processing / Spatio-temporal tweet analysis / Tweet clustering / Twitter
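The abstract mentions combining spatial and temporal distances with a textual similarity measure before clustering. The following is a minimal sketch of one way such a combined distance could be formed; it is not the authors' formulation. The `Tweet` class, the `combined_distance` function, the equal weights, and the normalization thresholds `max_km` and `max_hours` are all hypothetical choices made for illustration, and the embedding is assumed to be a fixed-length vector for each tweet (for example, an average of word vectors).

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import Sequence


@dataclass
class Tweet:
    """Hypothetical container for one geotagged tweet."""
    lat: float
    lon: float
    time: datetime
    embedding: Sequence[float]  # assumed: one vector per tweet text


def haversine_km(a: Tweet, b: Tweet) -> float:
    """Great-circle distance between two tweets in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def combined_distance(a: Tweet, b: Tweet,
                      max_km: float = 50.0, max_hours: float = 6.0,
                      w_space: float = 1 / 3, w_time: float = 1 / 3,
                      w_text: float = 1 / 3) -> float:
    """Weighted sum of normalized spatial, temporal, and textual distances.

    Assumption: each component is clamped to [0, 1] and the weights sum to 1,
    so the result also lies in [0, 1]. The paper may combine them differently.
    """
    d_space = min(haversine_km(a, b) / max_km, 1.0)
    hours_apart = abs((a.time - b.time).total_seconds()) / 3600.0
    d_time = min(hours_apart / max_hours, 1.0)
    d_text = min(max(cosine_distance(a.embedding, b.embedding), 0.0), 1.0)
    return w_space * d_space + w_time * d_time + w_text * d_text


if __name__ == "__main__":
    t1 = Tweet(34.2, -77.9, datetime(2018, 9, 14, 8, 0), [1.0, 0.0])
    t2 = Tweet(34.3, -77.9, datetime(2018, 9, 14, 9, 0), [0.9, 0.1])
    print(round(combined_distance(t1, t2), 3))
```

A pairwise matrix built from such a function could then be passed to a density-based clustering implementation that accepts precomputed distances, and the same matrix could be reused when computing a Silhouette coefficient. Varying the weights is one simple way to explore different combinations of the three components.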