Salemi, H., Senarath, Y., & Purohit, H. (2023). A Comparative Study of Pre-trained Language Models to Filter Informative Code-mixed Data on Social Media during Disasters. In Jaziar Radianti, Ioannis Dokas, Nicolas Lalone, & Deepak Khazanchi (Eds.), Proceedings of the 20th International ISCRAM Conference (pp. 920–932). Omaha, USA: University of Nebraska at Omaha.
Abstract: Social media can inform response agencies during disasters to help affected people. However, filtering informative messages from social media content is challenging because ungrammatical text, out-of-vocabulary words, and similar noise limit the contextual interpretation of messages. Further, there has been limited exploration of the challenge of code-mixing (using words from another language in a given text of one language) in user-generated content during disasters. Hence, we proposed a new code-mixed dataset of tweets related to the 2017 Iran-Iraq Earthquake and annotated them based on their informativeness. Additionally, we have evaluated the performance of state-of-the-art pre-trained language models (mBERT, RoBERTa, and XLM-R) on the proposed dataset. The results show that mBERT (with an F1 score of 72%) outperforms the other models in classifying informative code-mixed messages. Moreover, we analyzed some patterns in how users employ code-mixing, which can help future work on developing these models.
|
|
Simon Mille, Gerard Casamayor, Jens Grivolla, Alexander Shvets, & Leo Wanner. (2022). Automatic Multilingual Incident Report Generation for Crisis Management. In Rob Grace, & Hossein Baharmand (Eds.), ISCRAM 2022 Conference Proceedings – 19th International Conference on Information Systems for Crisis Response and Management (pp. 299–309). Tarbes, France.
Abstract: Successful and efficient crisis management depends on the availability of all accessible relevant information on the incidents during a crisis. The sources of this information are very often multiple and manifold, in particular in the case of environmental crises such as wildfires, floods, drought, etc. For the staff of the control centres it can be a challenge to follow up on all of them. In this paper, we present work in progress on an automatic multilingual incident report generator that produces summaries of all environmental incidents communicated by citizens or authorities in a given time range for a given region in terms of a text message, an audio, a video or an image and analyzed by dedicated modules into uniform knowledge representation structures.
|
|
Andrea Zielinski, & Ulrich Bügel. (2012). Multilingual analysis of twitter news in support of mass emergency events. In Z.Franco J. R. L. Rothkrantz (Ed.), ISCRAM 2012 Conference Proceedings – 9th International Conference on Information Systems for Crisis Response and Management. Vancouver, BC: Simon Fraser University.
Abstract: Social media are increasingly becoming a source for event-based early warning systems in the sense that they can help to detect natural disasters and support crisis management during or after disasters. In this work-in-progress paper we study the problems of analyzing multilingual Twitter feeds for emergency events. The present work focuses on English as “lingua franca” and on under-resourced Mediterranean languages in endangered zones, particularly Turkey, Greece, and Romania, as local civil protection authorities and the population are generally likely to respond in their native language. We investigated ten earthquake events and defined four language-specific classifiers that can be used to detect earthquakes by filtering out irrelevant messages that do not relate to the event. The final goal is to extend this work to more Mediterranean languages and to classify and extract relevant information from tweets, translating the main keywords into English. Preliminary results indicate that such a filter has the potential to confirm forecast parameters of tsunamis affecting coastal areas where no tide gauges exist and could be integrated into seismographic sensor networks. © 2012 ISCRAM.
|
|