Nasik Muhammad Nafi, Avishek Bose, Sarthak Khanal, Doina Caragea, & William H. Hsu. (2020). Abstractive Text Summarization of Disaster-Related Documents. In Amanda Hughes, Fiona McNeill, & Christopher W. Zobel (Eds.), ISCRAM 2020 Conference Proceedings – 17th International Conference on Information Systems for Crisis Response and Management (pp. 881–892). Blacksburg, VA (USA): Virginia Tech.
Abstract: Abstractive summarization is intended to capture key information from the full text of documents. In the application domain of disaster and crisis event reporting, key information includes disaster effects, cause, and severity. While some research on information extraction in the disaster domain has focused on keyphrase extraction from short disaster-related texts such as tweets, hardly any work attempts abstractive summarization of long disaster-related documents. Following the recent success of Reinforcement Learning (RL) in other domains, we leverage a state-of-the-art RL-based approach to abstractive summarization to summarize disaster-related documents. RL enables an agent to find an optimal policy by maximizing a reward. We design a novel hybrid reward metric for the disaster domain by combining Vector Similarity and Lexicon Matching (VecLex) to maximize the relevance of the abstract to the source document while focusing on disaster-related keywords. We evaluate the model on a disaster-related subset of the CNN/Daily Mail dataset consisting of 104,913 documents. The results show that our approach produces more informative summaries and achieves higher VecLex scores compared to the baseline.
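The idea of a hybrid reward that mixes vector similarity with lexicon matching can be illustrated with a small sketch. The abstract does not give the actual VecLex formula, so everything below is an assumption made for illustration: the bag-of-words cosine similarity (the paper may well use learned embeddings), the tiny `DISASTER_LEXICON`, the weight `alpha`, and the function name `veclex_style_reward` are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini-lexicon for illustration; not the lexicon used in the paper.
DISASTER_LEXICON = {"flood", "earthquake", "hurricane", "evacuated", "damage", "killed"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between bag-of-words count vectors (an assumed stand-in)."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lexicon_match(source_tokens, summary_tokens, lexicon):
    """Fraction of the source's lexicon terms that also appear in the summary."""
    source_terms = set(source_tokens) & lexicon
    if not source_terms:
        return 0.0
    return len(source_terms & set(summary_tokens)) / len(source_terms)


def veclex_style_reward(source, summary, alpha=0.5, lexicon=DISASTER_LEXICON):
    """Weighted mix of the two scores; alpha is an assumed mixing weight."""
    src, summ = tokenize(source), tokenize(summary)
    similarity = cosine_similarity(src, summ)
    coverage = lexicon_match(src, summ, lexicon)
    return alpha * similarity + (1 - alpha) * coverage


source = "The flood caused severe damage and thousands were evacuated"
on_topic = "Flood damage forced thousands to be evacuated"
off_topic = "The weather was severe for thousands"

print(veclex_style_reward(source, on_topic))
print(veclex_style_reward(source, off_topic))
```

In this sketch, the summary that keeps the disaster-related terms receives the higher reward, which is the behavior an RL agent would be encouraged toward when such a score is used as its reward signal.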