Anna Kruspe. (2020). Detecting Novelty in Social Media Messages During Emerging Crisis Events. In Amanda Hughes, Fiona McNeill, & Christopher W. Zobel (Eds.), ISCRAM 2020 Conference Proceedings – 17th International Conference on Information Systems for Crisis Response and Management (pp. 860–871). Blacksburg, VA (USA): Virginia Tech.
Abstract: Social media can be a highly valuable source of information during disasters. A crisis' development over time is of particular interest here, as social media messages can convey unfolding events in near-real time. Previous approaches for the automatic detection of information in such messages have focused on a static analysis, not taking temporal changes and already-known information into account. In this paper, we present a novel method for detecting new topics in incoming Twitter messages (tweets) conditional upon previously found related tweets. We do this by first extracting latent representations of each tweet using pre-trained sentence embedding models. Then, Infinite Mixture modeling is used to dynamically cluster these embeddings anew with each incoming tweet. Once a cluster reaches a minimum number of members, it is considered to be a new topic. We validate our approach on the TREC Incident Streams 2019A data set.
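Illustrative sketch (not from the paper): the pipeline described in the abstract (embed each tweet, re-cluster as tweets arrive, flag a cluster as a new topic once it reaches a minimum size) can be approximated as below. This is a simplified stand-in, not the authors' method: a greedy cosine-similarity clustering replaces Infinite Mixture modeling, and hand-written toy vectors replace the pre-trained sentence embeddings. The class name `OnlineNoveltyDetector` and the parameters `sim_threshold` and `min_members` are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class OnlineNoveltyDetector:
    """Greedy online clustering; a stand-in for the infinite mixture model."""

    def __init__(self, sim_threshold=0.8, min_members=3):
        self.sim_threshold = sim_threshold  # assumed value, not from the paper
        self.min_members = min_members      # assumed value, not from the paper
        self.centroids = []
        self.counts = []

    def add(self, vec):
        """Assign vec to a cluster. Return the cluster index if this
        assignment brings the cluster to exactly min_members (a newly
        detected topic), otherwise None."""
        best, best_sim = None, -1.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is None or best_sim < self.sim_threshold:
            # No sufficiently similar cluster: open a new one.
            self.centroids.append(list(vec))
            self.counts.append(1)
            best = len(self.centroids) - 1
        else:
            # Update the running mean of the matched cluster.
            n = self.counts[best]
            self.centroids[best] = [
                (c * n + x) / (n + 1) for c, x in zip(self.centroids[best], vec)
            ]
            self.counts[best] = n + 1
        return best if self.counts[best] == self.min_members else None


# Toy 2-d "embeddings" standing in for sentence-embedding vectors.
stream = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.05], [0.1, 0.9]]
detector = OnlineNoveltyDetector(sim_threshold=0.8, min_members=3)
events = [detector.add(vec) for vec in stream]
print(events)
```

In this toy stream, the fourth vector is the third member of the first cluster, so that is the point at which a topic would be reported. A faithful implementation would substitute a Dirichlet-process style mixture model for the greedy assignment step.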
|
Grégoire Burel, & Harith Alani. (2018). Crisis Event Extraction Service (CREES) – Automatic Detection and Classification of Crisis-related Content on Social Media. In Kees Boersma, & Brian Tomaszewski (Eds.), ISCRAM 2018 Conference Proceedings – 15th International Conference on Information Systems for Crisis Response and Management (pp. 597–608). Rochester, NY (USA): Rochester Institute of Technology.
Abstract: Social media posts tend to provide valuable reports during crises. However, this information can be hidden in large amounts of unrelated documents. Providing tools that automatically identify relevant posts, event types (e.g., hurricane, floods, etc.) and information categories (e.g., reports on affected individuals, donations and volunteering, etc.) in social media posts is vital for their efficient handling and consumption. We introduce the Crisis Event Extraction Service (CREES), an open-source web API that automatically classifies posts during crisis situations. The API provides annotations for crisis-related documents, event types and information categories through an easily deployable and accessible web API that can be integrated into multiple platforms and tools. The annotation service is backed by Convolutional Neural Networks (CNNs) and validated against traditional machine learning models. Results show that the CNN-based API results can be relied upon when dealing with specific crises, with the benefits associated with the usage of word embeddings.
|
Hongmin Li, Xukun Li, Doina Caragea, & Cornelia Caragea. (2018). Comparison of Word Embeddings and Sentence Encodings for Generalized Representations in Crisis Tweet Classifications. In Kristin Stock, & Deborah Bunker (Eds.), Proceedings of ISCRAM Asia Pacific 2018: Innovating for Resilience – 1st International Conference on Information Systems for Crisis Response and Management Asia Pacific (pp. 480–493). Albany, Auckland, New Zealand: Massey University.
Abstract: Many machine learning and natural language processing techniques, including supervised and domain adaptation algorithms, have been proposed and studied in the context of filtering crisis tweets. However, applying these approaches in real-time is still challenging because of time-critical requirements of emergency response operations and also diversities and unique characteristics of emergency events. In this paper, we explore the idea of building “generalized” classifiers for filtering crisis tweets that can be pre-trained, and are thus ready to use in real-time, while generalizing well on future disasters/crises data. We propose to achieve this using simple feature based adaptation with tweet representations based on word embeddings and also sentence-level embeddings, representations which do not rely on unlabeled data to achieve domain adaptations and can be easily implemented. Given that there are different types of word/sentence embeddings that are widely used, we propose to compare them to get a general idea about which type works better with crisis tweets classification tasks. Our experimental results show that GloVe embeddings in general work better with the datasets used in our evaluation, and that the supervised algorithms used in our experiments benefit from GloVe embeddings trained specifically on crisis data. Furthermore, our experimental results show that following GloVe, the sentence embeddings have great potential in crisis tweet tasks.
|
Justin Michael Crow. (2020). Verifying Baselines for Crisis Event Information Classification on Twitter. In Amanda Hughes, Fiona McNeill, & Christopher W. Zobel (Eds.), ISCRAM 2020 Conference Proceedings – 17th International Conference on Information Systems for Crisis Response and Management (pp. 670–687). Blacksburg, VA (USA): Virginia Tech.
Abstract: Social media are rich information sources during crisis events such as earthquakes and terrorist attacks. Despite myriad challenges, with the right tools, significant insight can be gained to assist emergency responders and related applications. However, most extant approaches are incomparable, using bespoke definitions, models, datasets and even evaluation metrics. Furthermore, it's rare that code, trained models, or exhaustive parametrisation details are openly available. Thus, even confirming self-reported performance is problematic; authoritatively determining state of the art (SOTA) is essentially impossible. Consequently, to begin addressing such endemic ambiguity, this paper makes 3 contributions: 1) replication and results confirmation of a leading technique; 2) testing straightforward modifications likely to improve performance; and 3) extension to a novel complementary type of crisis-relevant information to demonstrate its generalisability.
|
Kiran Zahra, Rahul Deb Das, Frank O. Ostermann, & Ross S. Purves. (2022). Towards an Automated Information Extraction Model from Twitter Threads during Disasters. In Rob Grace, & Hossein Baharmand (Eds.), ISCRAM 2022 Conference Proceedings – 19th International Conference on Information Systems for Crisis Response and Management (pp. 637–653). Tarbes, France.
Abstract: Social media plays a vital role as a communication source during large-scale disasters. The unstructured and informal nature of such short individual posts makes it difficult to extract useful information, often due to a lack of additional context. The potential of social media threads (sequences of posts) has not been explored as a source of adding context and more information to the initiating post. In this research, we explored Twitter threads as an information source and developed an information extraction model capable of extracting relevant information from threads posted during disasters. We used a crowdsourcing platform to determine whether a thread adds more information to the initial tweet and defined disaster-related information present in these threads into six themes: event reporting, location, time, intensity, casualty and damage reports, and help calls. For these themes, we created the respective thematic lexicons from WordNet. Moreover, we developed and compared four information extraction models trained on GloVe, word2vec, bag-of-words, and thematic bag-of-words to extract and summarize the most critical information from the threads. Our results reveal that 70 percent of all threads add information to the initiating post for various disaster-related themes. Furthermore, the thematic bag-of-words information extraction model outperforms the other algorithms and models for preserving the highest number of disaster-related themes.
|