Author Hongmin Li; Doina Caragea; Cornelia Caragea
Title Combining Self-training with Deep Learning for Disaster Tweet Classification Type Conference Article
Year 2021 Publication ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management Abbreviated Journal Iscram 2021
Volume Issue Pages 719-730
Keywords Domain Adaptation, Self-training, Crisis Tweets Classification, BERT, CNN
Abstract Significant progress has been made towards automated classification of disaster or crisis related tweets using machine learning approaches. Deep learning models, such as Convolutional Neural Networks (CNN), domain adaptation approaches based on self-training, and approaches based on pre-trained language models, such as BERT, have been proposed and used independently for disaster tweet classification. In this paper, we propose to combine self-training with CNN and BERT models, respectively, to improve the performance on the task of identifying crisis related tweets in a target disaster where labeled data is assumed to be unavailable, while unlabeled data is available. We evaluate the resulting self-training models on three crisis tweet collections and find that: 1) the pre-trained language model BERTweet is better than the standard BERT model, when fine-tuned for downstream crisis tweets classification; 2) self-training can help improve the performance of the CNN and BERTweet models for larger unlabeled target datasets, but not for smaller datasets.
Address Department of Computer Science, Kansas State University; Department of Computer Science, Kansas State University; Department of Computer Science, University of Illinois at Chicago
Corporate Author Thesis
Publisher Virginia Tech Place of Publication Blacksburg, VA (USA) Editor Anouck Adrot; Rob Grace; Kathleen Moore; Christopher W. Zobel
Language English Summary Language English Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-949373-61-5 Medium
Track Social Media for Disaster Response and Resilience Expedition Conference 18th International Conference on Information Systems for Crisis Response and Management
Notes hongminli@ksu.edu Approved no
Call Number ISCRAM @ idladmin @ Serial 2367
 

 
Author Hongmin Li; Doina Caragea; Cornelia Caragea
Title Towards Practical Usage of a Domain Adaptation Algorithm in the Early Hours of a Disaster Type Conference Article
Year 2017 Publication Proceedings of the 14th International Conference on Information Systems for Crisis Response And Management Abbreviated Journal Iscram 2017
Volume Issue Pages 692-704
Keywords Twitter; Domain adaptation; Disaster; Classification
Abstract Many machine learning techniques have been proposed to reduce the information overload in social media data during an emergency situation. Among such techniques, domain adaptation approaches present greater potential as compared to supervised algorithms because they don't require labeled data from the current disaster for training. However, the use of domain adaptation approaches in practice is sporadic at best. One reason is that domain adaptation algorithms have parameters that need to be tuned using labeled data from the target disaster, which is presumably not available. To address this limitation, we perform a study on one domain adaptation approach with the goal of understanding how much source data is needed to obtain good performance in a practical situation, and what parameter values of the approach give overall good performance. The results of our study provide useful insights into the practical application of domain adaptation algorithms in real crisis situations.
Address Kansas State University; University of North Texas
Corporate Author Thesis
Publisher Iscram Place of Publication Albi, France Editor Tina Comes, F.B., Chihab Hanachi, Matthieu Lauras, Aurélie Montarnal, eds
Language English Summary Language English Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2411-3387 ISBN Medium
Track Social Media Studies Expedition Conference 14th International Conference on Information Systems for Crisis Response And Management
Notes Approved no
Call Number ISCRAM @ idladmin @ Serial 2057
 

 
Author Hongmin Li; Nicolais Guevara; Nic Herndon; Doina Caragea; Kishore Neppalli; Cornelia Caragea; Anna Squicciarini; Andrea H. Tapia
Title Twitter Mining for Disaster Response: A Domain Adaptation Approach Type Conference Article
Year 2015 Publication ISCRAM 2015 Conference Proceedings – 12th International Conference on Information Systems for Crisis Response and Management Abbreviated Journal ISCRAM 2015
Volume Issue Pages
Keywords Disaster Response; domain adaptation; tweet classification
Abstract Microblogging data such as Twitter data contains valuable information that has the potential to help improve the speed, quality, and efficiency of disaster response. Machine learning can help with this by prioritizing the tweets with respect to various classification criteria. However, supervised learning algorithms require labeled data to learn accurate classifiers. Unfortunately, for a new disaster, labeled tweets are not easily available, while they are usually available for previous disasters. Furthermore, unlabeled tweets from the current disaster are accumulating fast. We study the usefulness of labeled data from a prior source disaster, together with unlabeled data from the current target disaster to learn domain adaptation classifiers for the target. Experimental results suggest that, for some tasks, source data itself can be useful for classifying target data. However, for tasks specific to a particular disaster, domain adaptation approaches that use target unlabeled data in addition to source labeled data are superior.
Address
Corporate Author Thesis
Publisher University of Agder (UiA) Place of Publication Kristiansand, Norway Editor L. Palen; M. Buscher; T. Comes; A. Hughes
Language English Summary Language English Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2411-3387 ISBN 9788271177881 Medium
Track Social Media Studies Expedition Conference ISCRAM 2015 Conference Proceedings – 12th International Conference on Information Systems for Crisis Response and Management
Notes Approved yes
Call Number Serial 1234
 

 
Author Hongmin Li; Xukun Li; Doina Caragea; Cornelia Caragea
Title Comparison of Word Embeddings and Sentence Encodings for Generalized Representations in Crisis Tweet Classifications Type Conference Article
Year 2018 Publication Proceedings of ISCRAM Asia Pacific 2018: Innovating for Resilience – 1st International Conference on Information Systems for Crisis Response and Management Asia Pacific. Abbreviated Journal Iscram Ap 2018
Volume Issue Pages 480-493
Keywords Word Embeddings, Sentence Encodings, Reduced Tweet Representation, Crisis Tweet Classification
Abstract Many machine learning and natural language processing techniques, including supervised and domain adaptation algorithms, have been proposed and studied in the context of filtering crisis tweets. However, applying these approaches in real-time is still challenging because of time-critical requirements of emergency response operations and also diversities and unique characteristics of emergency events. In this paper, we explore the idea of building “generalized” classifiers for filtering crisis tweets that can be pre-trained, and are thus ready to use in real-time, while generalizing well on future disasters/crises data. We propose to achieve this using simple feature based adaptation with tweet representations based on word embeddings and also sentence-level embeddings, representations which do not rely on unlabeled data to achieve domain adaptations and can be easily implemented. Given that there are different types of word/sentence embeddings that are widely used, we propose to compare them to get a general idea about which type works better with crisis tweets classification tasks. Our experimental results show that GloVe embeddings in general work better with the datasets used in our evaluation, and that the supervised algorithms used in our experiments benefit from GloVe embeddings trained specifically on crisis data. Furthermore, our experimental results show that following GloVe, the sentence embeddings have great potential in crisis tweet tasks.
Address Kansas State University; Kansas State University; Kansas State University; Kansas State University
Corporate Author Thesis
Publisher Massey University Place of Publication Albany, Auckland, New Zealand Editor Kristin Stock; Deborah Bunker
Language English Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Track Social Media and Community Engagement Supporting Resilience Building Expedition Conference
Notes Approved no
Call Number Serial 1689
 

 
Author Reza Mazloom; HongMin Li; Doina Caragea; Muhammad Imran; Cornelia Caragea
Title Classification of Twitter Disaster Data Using a Hybrid Feature-Instance Adaptation Approach Type Conference Article
Year 2018 Publication ISCRAM 2018 Conference Proceedings – 15th International Conference on Information Systems for Crisis Response and Management Abbreviated Journal Iscram 2018
Volume Issue Pages 727-735
Keywords Tweet classification, Domain adaptation, Matrix factorization, k-Nearest Neighbors, Disaster response
Abstract Huge amounts of data that are generated on social media during emergency situations are regarded as troves of critical information. The use of supervised machine learning techniques in the early stages of a disaster is challenged by the lack of labeled data for that particular disaster. Furthermore, supervised models trained on labeled data from a prior disaster may not produce accurate results, given the inherent variation between the current and the prior disasters. To address the challenges posed by the lack of labeled data for a target disaster, we propose to use a hybrid feature-instance adaptation approach based on matrix factorization and the k nearest neighbors algorithm, respectively. The proposed hybrid adaptation approach is used to select a subset of the source disaster data that is representative for the target disaster. The selected subset is subsequently used to learn accurate Naive Bayes classifiers for the target disaster.
Address
Corporate Author Thesis
Publisher Rochester Institute of Technology Place of Publication Rochester, NY (USA) Editor Kees Boersma; Brian Tomaszewski
Language English Summary Language English Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2411-3387 ISBN 978-0-692-12760-5 Medium
Track Social Media Studies Expedition Conference ISCRAM 2018 Conference Proceedings - 15th International Conference on Information Systems for Crisis Response and Management
Notes Approved no
Call Number Serial 2146