Automatic Topic-Guided Segmentation of Holocaust Survivor Testimonies
Automatic Topic-Guided Segmentation of Holocaust Survivor Testimonies
In recent decades, efforts have been made to gather and digitize the testimonies of living Holocaust survivors. The challenge we now face is attending to those thousands of human stories, which while safely stored in archives, may nevertheless disappear into oblivion. Despite recent advances in narrative analysis in the fields of Computational Literature (CL) and Natural Language Processing (NLP), existing language model technology still faces challenges in analyzing elaborate narratives and long texts. One such challenge is text segmentation – a long-standing issue in the area of CL and NLP. In our work, we propose a computational method to approach this problem. Our research draws on testimony transcripts from the Shoah Foundation (SF) Holocaust archive for supervised topic classification, which is then used as topics guidance for automatic segmentation.

