|
Hongmin Li, Xukun Li, Doina Caragea, & Cornelia Caragea. (2018). Comparison of Word Embeddings and Sentence Encodings for Generalized Representations in Crisis Tweet Classifications. In Kristin Stock, & Deborah Bunker (Eds.), Proceedings of ISCRAM Asia Pacific 2018: Innovating for Resilience – 1st International Conference on Information Systems for Crisis Response and Management Asia Pacific (pp. 480–493). Albany, Auckland, New Zealand: Massey University.
Abstract: Many machine learning and natural language processing techniques, including supervised and domain adaptation algorithms, have been proposed and studied in the context of filtering crisis tweets. However, applying these approaches in real time is still challenging because of the time-critical requirements of emergency response operations and the diversity and unique characteristics of emergency events. In this paper, we explore the idea of building “generalized” classifiers for filtering crisis tweets that can be pre-trained, and are thus ready to use in real time, while generalizing well to data from future disasters/crises. We propose to achieve this using simple feature-based adaptation with tweet representations based on word embeddings and sentence-level embeddings, representations which do not rely on unlabeled data to achieve domain adaptation and can be easily implemented. Given that different types of word/sentence embeddings are widely used, we propose to compare them to get a general idea of which type works better for crisis tweet classification tasks. Our experimental results show that GloVe embeddings in general work better with the datasets used in our evaluation, and that the supervised algorithms used in our experiments benefit from GloVe embeddings trained specifically on crisis data. Furthermore, our experimental results show that, after GloVe, sentence embeddings have great potential in crisis tweet tasks.
|
|
|
Xukun Li, & Doina Caragea. (2020). Improving Disaster-related Tweet Classification with a Multimodal Approach. In Amanda Hughes, Fiona McNeill, & Christopher W. Zobel (Eds.), ISCRAM 2020 Conference Proceedings – 17th International Conference on Information Systems for Crisis Response and Management (pp. 893–902). Blacksburg, VA (USA): Virginia Tech.
Abstract: Social media data analysis is important for disaster management. Many prior studies have focused on classifying a tweet based on its text or based on its images, independently, even if the tweet contains both text and images. Under the assumption that text and images may contain complementary information, it is of interest to construct classifiers that make use of both modalities of the tweet. Towards this goal, we propose a multimodal classification model which aggregates text and image information. Our study aims to provide insights into the benefits obtained by combining text and images, and to understand which modality is more informative with respect to disaster tweet classification. Experimental results show that both text and image classification can be improved by the multimodal approach.
|
|
|
Xukun Li, Doina Caragea, Cornelia Caragea, Muhammad Imran, & Ferda Ofli. (2019). Identifying Disaster Damage Images Using a Domain Adaptation Approach. In Z. Franco, J. J. González, & J. H. Canós (Eds.), Proceedings of the 16th International Conference on Information Systems for Crisis Response and Management. Valencia, Spain: ISCRAM.
Abstract: Approaches for effectively filtering useful situational awareness information posted by eyewitnesses of disasters, in real time, are greatly needed. While many studies have focused on filtering textual information, the research on filtering disaster images is more limited. In particular, there are no studies on the applicability of domain adaptation to filter images from an emergent target disaster, when no labeled data is available for the target disaster. To fill in this gap, we propose to apply a domain adaptation approach, called domain adversarial neural networks (DANN), to the task of identifying images that show damage. The DANN approach has VGG-19 as its backbone, and uses adversarial training to find a transformation that makes the source and target data indistinguishable. Experimental results on several pairs of disasters suggest that the DANN model generally gives similar or better results compared to the VGG-19 model fine-tuned on the source labeled data.
|
|