|
Record |
|
Author |
Jens Kersten; Jan Bongard; Friederike Klan |
|
|
Title |
Combining Supervised and Unsupervised Learning to Detect and Semantically Aggregate Crisis-Related Twitter Content |
Type |
Conference Article |
|
Year |
2021 |
Publication |
ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management |
Abbreviated Journal |
ISCRAM 2021 |
|
|
Volume |
|
Issue |
|
Pages |
744-754 |
|
|
Keywords |
Information Overload Reduction, Semantic Clustering, Crisis Informatics, Twitter Stream |
|
|
Abstract |
Twitter is an immediate and almost ubiquitous platform and can therefore be a valuable source of information during disasters. Current methods for identifying and classifying crisis-related content are often based on single tweets, i.e., already known information from the past is neglected. In this paper, the combination of tweet-wise pre-trained neural networks and unsupervised semantic clustering is proposed and investigated. The intention is (1) to enhance the generalization capability of pre-trained models, (2) to be able to handle massive amounts of stream data, (3) to reduce information overload by identifying potentially crisis-related content, and (4) to obtain a semantically aggregated data representation that allows for further automated, manual and visual analyses. Latent representations of each tweet based on pre-trained sentence embedding models are used for both clustering and tweet classification. For fast, robust and time-continuous processing, subsequent time periods are clustered individually according to a Chinese restaurant process. Clusters without any tweet classified as crisis-related are pruned. Data aggregation over time is ensured by merging semantically similar clusters. A comparison of our hybrid method to a similar clustering approach, as well as first quantitative and qualitative results from experiments with two different labeled data sets, demonstrate the great potential for crisis-related Twitter stream analyses. |
|
|
Address |
German Aerospace Center (DLR), Institute of Data Science, Citizen Science Department; German Aerospace Center (DLR), Institute of Data Science, Citizen Science Department; German Aerospace Center (DLR), Institute of Data Science, Citizen Science Department |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Virginia Tech |
Place of Publication |
Blacksburg, VA (USA) |
Editor |
Anouck Adrot; Rob Grace; Kathleen Moore; Christopher W. Zobel |
|
|
Language |
English |
Summary Language |
English |
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-949373-61-5 |
Medium |
|
|
|
Track |
Social Media for Disaster Response and Resilience |
Expedition |
|
Conference |
18th International Conference on Information Systems for Crisis Response and Management |
|
|
Notes |
jens.kersten@dlr.de |
Approved |
no |
|
|
Call Number |
ISCRAM @ idladmin @ |
Serial |
2369 |
|