Kiran Zahra, Rahul Deb Das, Frank O. Ostermann, & Ross S. Purves. (2022). Towards an Automated Information Extraction Model from Twitter Threads during Disasters. In Rob Grace, & Hossein Baharmand (Eds.), ISCRAM 2022 Conference Proceedings – 19th International Conference on Information Systems for Crisis Response and Management (pp. 637–653). Tarbes, France.
Abstract: Social media plays a vital role as a communication source during large-scale disasters. The unstructured and informal nature of such short individual posts makes it difficult to extract useful information, often due to a lack of additional context. The potential of social media threads– sequences of posts– has not been explored as a source of adding context and more information to the initiating post. In this research, we explored Twitter threads as an information source and developed an information extraction model capable of extracting relevant information from threads posted during disasters. We used a crowdsourcing platform to determine whether a thread adds more information to the initial tweet and defined disaster-related information present in these threads into six themes– event reporting, location, time, intensity, casualty and damage reports, and help calls. For these themes, we created the respective thematic lexicons from WordNet. Moreover, we developed and compared four information extraction models trained on GloVe, word2vec, bag-of-words, and thematic bag-of-words to extract and summarize the most critical information from the threads. Our results reveal that 70 percent of all threads add information to the initiating post for various disaster-related themes. Furthermore, the thematic bag-of-words information extraction model outperforms the other algorithms and models for preserving the highest number of disaster-related themes.
|