Records |
Author  |
Jens Kersten; Jan Bongard; Friederike Klan |
Title |
Gaussian Processes for One-class and Binary Classification of Crisis-related Tweets |
Type |
Conference Article |
Year |
2022 |
Publication |
ISCRAM 2022 Conference Proceedings – 19th International Conference on Information Systems for Crisis Response and Management |
Abbreviated Journal |
ISCRAM 2022 |
Volume |
|
Issue |
|
Pages |
664-673 |
Keywords |
Gaussian Process; One-class Classification; Twitter; Overload Reduction; Crisis Informatics |
Abstract |
Overload reduction is essential to exploit Twitter text data for crisis management. Commonly used pre-trained machine learning models require training data for both crisis-related and off-topic content. However, this task can also be formulated as a one-class classification problem in which labeled off-topic samples are not required. Gaussian processes (GPs) have great potential in both binary and one-class settings and are therefore investigated in this work. Deep kernel learning combines the representative power of text embeddings with the Bayesian formalism of GPs. Motivated by this, we investigate the potential of deep kernel models for the task of classifying crisis-related tweet texts, with special emphasis on cross-event applications. Compared to standard binary neural networks, first experiments with one-class GP models reveal great potential for realistic scenarios, offering a fast and flexible approach for interactive model training without requiring off-topic training samples or comprehensive expert knowledge (only two model parameters are involved). |
Address |
German Aerospace Center – Jena, Germany; German Aerospace Center – Jena, Germany; German Aerospace Center – Jena, Germany |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
Tarbes, France |
Editor |
Rob Grace; Hossein Baharmand |
Language |
English |
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
2411-3387 |
ISBN |
978-82-8427-099-9 |
Medium |
|
Track |
Social Media for Crisis Management |
Expedition |
|
Conference |
|
Notes |
|
Approved |
no |
Call Number |
ISCRAM @ idladmin @ |
Serial |
2446 |
|
|
|
Author  |
Jens Kersten; Jan Bongard; Friederike Klan |
Title |
Combining Supervised and Unsupervised Learning to Detect and Semantically Aggregate Crisis-Related Twitter Content |
Type |
Conference Article |
Year |
2021 |
Publication |
ISCRAM 2021 Conference Proceedings – 18th International Conference on Information Systems for Crisis Response and Management |
Abbreviated Journal |
ISCRAM 2021 |
Volume |
|
Issue |
|
Pages |
744-754 |
Keywords |
Information Overload Reduction; Semantic Clustering; Crisis Informatics; Twitter Stream |
Abstract |
Twitter is an immediate and almost ubiquitous platform and can therefore be a valuable source of information during disasters. Current methods for identifying and classifying crisis-related content are often based on single tweets, i.e., already known information from the past is neglected. In this paper, the combination of tweet-wise pre-trained neural networks and unsupervised semantic clustering is proposed and investigated. The intention is to (1) enhance the generalization capability of pre-trained models, (2) handle massive amounts of stream data, (3) reduce information overload by identifying potentially crisis-related content, and (4) obtain a semantically aggregated data representation that allows for further automated, manual and visual analyses. Latent representations of each tweet based on pre-trained sentence embedding models are used for both clustering and tweet classification. For fast, robust and time-continuous processing, subsequent time periods are clustered individually according to a Chinese restaurant process. Clusters without any tweet classified as crisis-related are pruned. Data aggregation over time is ensured by merging semantically similar clusters. A comparison of our hybrid method to a similar clustering approach, as well as first quantitative and qualitative results from experiments with two different labeled data sets, demonstrate the great potential for crisis-related Twitter stream analyses. |
Address |
German Aerospace Center (DLR), Institute of Data Science, Citizen Science Department; German Aerospace Center (DLR), Institute of Data Science, Citizen Science Department; German Aerospace Center (DLR), Institute of Data Science, Citizen Science Department |
Corporate Author |
|
Thesis |
|
Publisher |
Virginia Tech |
Place of Publication |
Blacksburg, VA (USA) |
Editor |
Anouck Adrot; Rob Grace; Kathleen Moore; Christopher W. Zobel |
Language |
English |
Summary Language |
English |
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-949373-61-5 |
Medium |
|
Track |
Social Media for Disaster Response and Resilience |
Expedition |
|
Conference |
18th International Conference on Information Systems for Crisis Response and Management |
Notes |
jens.kersten@dlr.de |
Approved |
no |
Call Number |
ISCRAM @ idladmin @ |
Serial |
2369 |