Jens Kersten, Anna Kruspe, Matti Wiegmann, & Friederike Klan. (2019). Robust filtering of crisis-related tweets. In Z. Franco, J. J. González, & J. H. Canós (Eds.), Proceedings of the 16th International Conference on Information Systems for Crisis Response and Management. Valencia, Spain: ISCRAM.
Abstract: Social media enables fast information exchange and status reporting during crises. Filtering is usually required to identify the small fraction of social media stream data related to events. Since deep learning has recently been shown to be a reliable approach for filtering and analyzing Twitter messages, a Convolutional Neural Network is examined for filtering crisis-related tweets in this work. The goal is to understand how to obtain accurate and robust filtering models and how model accuracies tend to behave for new events. In contrast to other works, the application to real data streams is also investigated. Motivated by the observation that machine learning model accuracies depend strongly on the data used, a new comprehensive and balanced compilation of existing data sets is proposed. Experimental results with this data set provide valuable insights. Preliminary results from filtering a data stream recorded during Hurricane Florence in September 2018 confirm our findings.
Matti Wiegmann, Jens Kersten, Friederike Klan, Martin Potthast, & Benno Stein. (2020). Analysis of Detection Models for Disaster-Related Tweets. In Amanda Hughes, Fiona McNeill, & Christopher W. Zobel (Eds.), ISCRAM 2020 Conference Proceedings – 17th International Conference on Information Systems for Crisis Response and Management (pp. 872–880). Blacksburg, VA (USA): Virginia Tech.
Abstract: Social media is perceived as a rich resource for disaster management and relief efforts, but the high class imbalance between disaster-related and non-disaster-related messages challenges reliable detection. We analyze and compare the effectiveness of three state-of-the-art machine learning models for detecting disaster-related tweets. In this regard we introduce the Disaster Tweet Corpus 2020, an extended compilation of existing resources, which comprises a total of 123,166 tweets from 46 disasters covering 9 disaster types. Our findings from a large series of experiments include: detection models work equally well over a broad range of disaster types when trained for the respective type; a domain transfer across disaster types leads to unacceptable performance drops; and, similarly, type-agnostic classification models behave more robustly at a lower effectiveness level. Altogether, the average misclassification rate of 3.8% on performance-optimized detection models indicates effective classification knowledge, but comes at the price of insufficient generalizability.