Cody Buntain, Richard McCreadie, & Ian Soboroff. (2022). Incident Streams 2021 Off the Deep End: Deeper Annotations and Evaluations in Twitter. In Rob Grace, & Hossein Baharmand (Eds.), ISCRAM 2022 Conference Proceedings – 19th International Conference on Information Systems for Crisis Response and Management (pp. 584–604). Tarbes, France.
Abstract: This paper summarizes the final year of the four-year Text REtrieval Conference Incident Streams track (TREC-IS), which has produced a large dataset comprising 136,263 annotated tweets, spanning 98 crisis events. Goals of this final year were twofold: 1) to add new categories for assessing messages, with a focus on characterizing the audience, author, and images associated with these messages, and 2) to enlarge the TREC-IS dataset with new events, with an emphasis on deeper pools for sampling. Beyond these two goals, TREC-IS has nearly doubled the number of annotated messages per event for the 26 crises introduced in 2021 and has released a new parallel dataset of 312,546 images associated with crisis content – with 7,297 tweets having annotations about their embedded images. Our analyses of this new crisis data yield new insights about the context of a tweet; e.g., messages intended for a local audience and those that contain images of weather forecasts and infographics have higher than average assessments of priority but are relatively rare. Tweets containing images, however, have higher perceived priorities than tweets without images. Moving to deeper pools, while tending to lower classification performance, also does not generally impact performance rankings or alter distributions of information-types. We end this paper with a discussion of these datasets, analyses, their implications, and how they contribute both new data and insights to the broader crisis informatics community.
|
Cody Buntain, Richard McCreadie, & Ian Soboroff. (2021). Incident Streams 2020: TRECIS in the Time of COVID-19. In Anouck Adrot, Rob Grace, Kathleen Moore, & Christopher W. Zobel (Eds.), ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management (pp. 621–639). Blacksburg, VA (USA): Virginia Tech.
Abstract: Between 2018 and 2019, the Incident Streams track (TREC-IS) has developed standard approaches for classifying the types and criticality of information shared in online social spaces during crises, but the introduction of SARS-CoV-2 has shifted the landscape of online crises substantially. While prior editions of TREC-IS have lacked data on large-scale public-health emergencies as these events are exceedingly rare, COVID-19 has introduced an over-abundance of potential data, and significant open questions remain about how existing approaches to crisis informatics and datasets built on other emergencies adapt to this new context. This paper describes how the 2020 edition of TREC-IS has addressed these dual issues by introducing a new COVID-19-specific task for evaluating generalization of existing COVID-19 annotation and system performance to this new context, applied to 11 regions across the globe. TREC-IS has also continued expanding its set of target crises, adding 29 new events and expanding the collection of event types to include explosions, fires, and general storms, making for a total of 9 event types in addition to the new COVID-19 events. Across these events, TREC-IS has made available 478,110 COVID-related messages and 282,444 crisis-related messages for participant systems to analyze, of which 14,835 COVID-related and 19,784 crisis-related messages have been manually annotated. Analyses of these new datasets and participant systems demonstrate first that both the distributions of information type and priority of information vary between general crises and COVID-19-related discussion. Secondly, despite these differences, results suggest leveraging general crisis data in the COVID-19 context improves performance over baselines. Using these results, we provide guidance on which information types appear most consistent between general crises and COVID-19.
|
Pooneh Mousavi, & Cody Buntain. (2022). “Please Donate for the Affected”: Supporting Emergency Managers in Finding Volunteers and Donations in Twitter Across Disasters. In Rob Grace, & Hossein Baharmand (Eds.), ISCRAM 2022 Conference Proceedings – 19th International Conference on Information Systems for Crisis Response and Management (pp. 605–622). Tarbes, France.
Abstract: Despite the outpouring of social support posted to social media channels in the aftermath of disaster, finding and managing content that can translate into community relief, donations, volunteering, or other recovery support is difficult due to the lack of sufficient annotated data around volunteerism. This paper outlines three experiments to alleviate these difficulties. First, we estimate to what degree volunteerism content from one crisis is transferable to another by evaluating the consistency of language in volunteer- and donation-related social media content across 78 disasters. Second, it introduces methods for providing computational support in this emergency support function and developing semi-automated models for classifying volunteer- and donation-related social media content in new disaster events. Results show volunteer- and donation-related social media content is sufficiently similar across disasters and disaster types to warrant transferring models across disasters, and we evaluate simple resampling techniques for tuning these models. We then introduce and evaluate a weak-supervision approach to integrate domain knowledge from emergency response officers with machine learning models to improve classification accuracy and accelerate this emergency support in new events. This method helps to overcome the scarcity in data that we observe related to volunteer- and donation-related social media content.
|
Richard McCreadie, Cody Buntain, & Ian Soboroff. (2020). Incident Streams 2019: Actionable Insights and How to Find Them. In Amanda Hughes, Fiona McNeill, & Christopher W. Zobel (Eds.), ISCRAM 2020 Conference Proceedings – 17th International Conference on Information Systems for Crisis Response and Management (pp. 744–760). Blacksburg, VA (USA): Virginia Tech.
Abstract: The ubiquity of mobile internet-enabled devices combined with wide-spread social media use during emergencies is posing new challenges for response personnel. In particular, service operators are now expected to monitor these online channels to extract actionable insights and answer questions from the public. A lack of adequate tools makes this monitoring impractical at the scale of many emergencies. The TREC Incident Streams (TREC-IS) track drives research into solving this technology gap by bringing together academia and industry to develop techniques for extracting actionable insights from social media streams during emergencies. This paper covers the second year of TREC-IS, hosted in 2019 with two editions, 2019-A and 2019-B, contributing 12 new events and approximately 20,000 new tweets across 25 information categories, with 15 research groups participating across the world. This paper provides an overview of these new editions, actionable insights from data labelling, and the automated techniques employed by participant systems that appear most effective.
|
Richard McCreadie, Cody Buntain, & Ian Soboroff. (2019). TREC Incident Streams: Finding Actionable Information on Social Media. In Z. Franco, J. J. González, & J. H. Canós (Eds.), Proceedings of the 16th International Conference on Information Systems for Crisis Response And Management. Valencia, Spain: Iscram.
Abstract: The Text Retrieval Conference (TREC) Incident Streams track is a new initiative that aims to mature social media-based emergency response technology. This initiative advances the state of the art in this area through an evaluation challenge, which attracts researchers and developers from across the globe. The 2018 edition of the track provides a standardized evaluation methodology, an ontology of emergency-relevant social media information types, proposes a scale for information criticality, and releases a dataset containing fifteen test events and approximately 20,000 labeled tweets. Analysis of this dataset reveals a significant amount of actionable information on social media during emergencies (> 10%). While this data is valuable for emergency response efforts, analysis of the 39 state-of-the-art systems demonstrates a performance gap in identifying this data. We therefore find the current state-of-the-art is insufficient for emergency responders' requirements, particularly for rare actionable information for which there is little prior training data available.
|
Shivam Sharma, & Cody Buntain. (2022). Bang for your Buck: Performance Impact Across Choices in Learning Architectures for Crisis Informatics. In Rob Grace, & Hossein Baharmand (Eds.), ISCRAM 2022 Conference Proceedings – 19th International Conference on Information Systems for Crisis Response and Management (pp. 719–736). Tarbes, France.
Abstract: Over the years, with the increase in social media engagement, there has been an increase in various pipelines to analyze, classify and prioritize crisis-related data on various social media platforms. These pipelines utilize various data augmentation methods to counter imbalanced crisis data, sophisticated and off-the-shelf models for training. However, there is a lack of a comprehensive study that compares these methods for the various sections of a pipeline. In this study, we split a general crisis-related pipeline into 3 major sections, namely, data augmentation, model selection, and training methodology. We compare various methods for each of these sections and then present a comprehensive evaluation of which section to prioritize based on the results from various pipelines. We compare our results against two separate tasks, information classification and priority scoring for crisis-related tweets. Our results suggest that data augmentation, in general, improves the performance. However, sophisticated, state-of-the-art language models like DeBERTa only show performance gain in information classification tasks, and models like RoBERTa tend to show a consistent performance increase over our presented baseline consisting of BERT. We also show that, though training two separate task-specific BERT models does show better performance than one BERT model with multi-task learning methodology over an imbalanced dataset, multi-task learning does improve performance for a more sophisticated model like DeBERTa with a much more balanced dataset after augmentation.
|
Shivam Sharma, & Cody Buntain. (2021). An Evaluation of Twitter Datasets from Non-Pandemic Crises Applied to Regional COVID-19 Contexts. In Anouck Adrot, Rob Grace, Kathleen Moore, & Christopher W. Zobel (Eds.), ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management (pp. 808–815). Blacksburg, VA (USA): Virginia Tech.
Abstract: In 2020, we have witnessed an unprecedented crisis event, the COVID-19 pandemic. Various questions arise regarding the nature of this crisis data and the impacts it would have on the existing tools. In this paper, we aim to study whether we can include pandemic-type crisis events with general non-pandemic events and hypothesize that including labeled crisis data from a variety of non-pandemic events will improve classification performance over models trained solely on pandemic events. To test our hypothesis we study the model performance for different models by performing a cross validation test on pandemic-only held-out sets for two different types of training sets, one containing only pandemic data and the other a combination of pandemic and non-pandemic crisis data, and comparing the results of the two. Our results support our hypothesis and give evidence of some crucial information propagation upon inclusion of non-pandemic crisis data to pandemic data.
|
Valerio Lorini, Carlos Castillo, Steve Peterson, Paola Rufolo, Hemant Purohit, Diego Pajarito, et al. (2021). Social Media for Emergency Management: Opportunities and Challenges at the Intersection of Research and Practice. In Anouck Adrot, Rob Grace, Kathleen Moore, & Christopher W. Zobel (Eds.), ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management (pp. 772–777). Blacksburg, VA (USA): Virginia Tech.
Abstract: This paper summarizes key opportunities and challenges identified during the workshop “Social Media for Disaster Risk Management: Researchers Meet Practitioners” which took place online in November 2020. It constitutes a work-in-progress towards identifying new directions for research and development of systems that can better serve the information needs of emergency managers. Practitioners widely recognize the potential of accessing timely information from social media. Nevertheless, the discussion outlined some critical challenges for improving its adoption during crises. In particular, validating such information and integrating it with authoritative information and into more traditional information systems for emergency managers requires further work, and the negative impacts of misinformation and disinformation need to be prevented.
|