|
Ahmed Abdeltawab Abdelgawad, & Tina Comes. (2019). Evaluation Framework for the iTRACK Integrated System. In Z. Franco, J. J. González, & J. H. Canós (Eds.), Proceedings of the 16th International Conference on Information Systems for Crisis Response and Management. Valencia, Spain: ISCRAM.
Abstract: Evaluation and testing are major steps in the development of any information system, particularly if it is to be used in high-risk contexts such as conflicts. While there are various approaches for testing against technology requirements, usability, or usefulness, a comprehensive evaluation framework that combines these three elements is still lacking. The absence of such a framework and of commonly agreed standards constitutes a barrier to innovation and, at the same time, poses risks to responders if technology is introduced without proper testing. This paper aims to close this gap. Based on a review of evaluation methods and measurement metrics, we design a comprehensive evaluation framework that includes common code quality testing metrics, usability testing methods, subjective usefulness questionnaires, and performance indicators. We demonstrate our approach using the example of an integrated system for the safety and security of humanitarian missions, and we highlight how it allows measuring the system's quality and usefulness.
|
|
|
Matti Wiegmann, Jens Kersten, Friederike Klan, Martin Potthast, & Benno Stein. (2020). Analysis of Detection Models for Disaster-Related Tweets. In Amanda Hughes, Fiona McNeill, & Christopher W. Zobel (Eds.), ISCRAM 2020 Conference Proceedings – 17th International Conference on Information Systems for Crisis Response and Management (pp. 872–880). Blacksburg, VA (USA): Virginia Tech.
Abstract: Social media is perceived as a rich resource for disaster management and relief efforts, but the high class imbalance between disaster-related and non-disaster-related messages challenges reliable detection. We analyze and compare the effectiveness of three state-of-the-art machine learning models for detecting disaster-related tweets. In this regard we introduce the Disaster Tweet Corpus 2020, an extended compilation of existing resources, which comprises a total of 123,166 tweets from 46 disasters covering 9 disaster types. Our findings from a large series of experiments include: detection models work equally well over a broad range of disaster types when trained for the respective type; domain transfer across disaster types leads to unacceptable performance drops; and type-agnostic classification models behave more robustly, albeit at a lower effectiveness level. Altogether, the average misclassification rate of 3.8% for performance-optimized detection models indicates effective classification knowledge, but comes at the price of insufficient generalizability.
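The type-specific versus type-agnostic training regimes compared in this abstract can be sketched in a few lines. The baseline below (TF-IDF features with logistic regression) and its toy data are illustrative assumptions chosen for exposition, not the models or corpus the paper actually evaluates:

```python
# Minimal sketch: type-specific vs. type-agnostic disaster-tweet detection.
# The TF-IDF + logistic-regression baseline and the toy data are assumptions;
# the paper evaluates its own models on the Disaster Tweet Corpus 2020.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy rows: (text, label, disaster_type); label 1 = disaster-related.
data = [
    ("Wildfire spreading near the highway, evacuations underway", 1, "fire"),
    ("Smoke everywhere, fire crews on site", 1, "fire"),
    ("River burst its banks, streets flooded downtown", 1, "flood"),
    ("Flood warning issued for the valley", 1, "flood"),
    ("Great concert last night!", 0, "none"),
    ("Anyone up for coffee later?", 0, "none"),
]

def train(rows):
    texts = [text for text, _, _ in rows]
    labels = [label for _, label, _ in rows]
    return make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, labels)

# Type-specific model: trained on one disaster type plus negatives.
fire_model = train([r for r in data if r[2] in ("fire", "none")])

# Type-agnostic model: trained on all types at once.
agnostic_model = train(data)

# Domain transfer: applying the fire model to flood tweets mirrors the
# cross-type setting in which the paper observes large performance drops.
flood_tweets = ["Flooding reported across the city center"]
print(fire_model.predict(flood_tweets))
print(agnostic_model.predict(flood_tweets))
```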
|
|
|
Thomas Münzberg, Marcus Wiens, & Frank Schultmann. (2014). A Strategy Evaluation Framework Based on Dynamic Vulnerability Assessments. In S. R. Hiltz, M. S. Pfaff, L. Plotnick, & P. C. Shih (Eds.), ISCRAM 2014 Conference Proceedings – 11th International Conference on Information Systems for Crisis Response and Management (pp. 45–54). University Park, PA: The Pennsylvania State University.
Abstract: Assessing a system's vulnerability is a widely used method to estimate the effects of risks. In recent years, dynamic vulnerability assessments have increasingly been developed to capture changes in vulnerability over time (e.g., in climate change, coastal vulnerability, and flood management). This implies that the dynamic influences of management strategies on vulnerability need to be considered when strategies are selected and implemented. For this purpose, we present a strategy evaluation framework based on dynamic vulnerability assessments. The key contribution reported in this paper is an evaluation framework that considers how well strategies achieve a predefined target level of protection over time, where protection target levels are predefined objectives. The proposed framework is inspired by Goal Programming methods and uses weights to distinguish the relevance of time-dependent achievements. This enables decision-makers to evaluate the overall performance of strategies, to test them, and to compare their outcomes.
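The weighted, time-dependent goal achievement described in this abstract can be illustrated with a small sketch. The linear shortfall penalty, the name strategy_score, and all numbers below are hypothetical assumptions for illustration, not the paper's actual Goal Programming formulation:

```python
# Minimal sketch of a Goal-Programming-style strategy evaluation:
# time-weighted shortfalls of achieved protection below predefined
# target levels. Penalty form and numbers are illustrative assumptions.

def strategy_score(achieved, targets, weights):
    """Sum of weighted shortfalls; 0 means every target is met, lower is better."""
    return sum(
        w * max(t - a, 0.0)
        for a, t, w in zip(achieved, targets, weights)
    )

targets = [0.8, 0.8, 0.9, 0.9]      # predefined protection target levels per time step
weights = [1.0, 1.0, 2.0, 2.0]      # later achievements weighted more strongly here
strategy_a = [0.7, 0.8, 0.9, 0.95]  # hypothetical vulnerability assessment results
strategy_b = [0.9, 0.9, 0.8, 0.85]

for name, achieved in [("A", strategy_a), ("B", strategy_b)]:
    print(name, strategy_score(achieved, targets, weights))  # lower score wins
```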
|
|