Du, Keli ; Dudar, Julia ; Schöch, Christof (2023)
Evaluation of Measures of Distinctiveness. Classification of Literary Texts on the Basis of Distinctive Words.
In: Journal of Computational Literary Studies, 2022, 1 (1)
doi: 10.26083/tuprints-00023252
Article, Secondary publication, Publisher's Version
Text: jcls-102-du.pdf (630kB). Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.
Text: jcls-102-du.xml (93kB). Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.
Item Type: | Article |
---|---|
Type of entry: | Secondary publication |
Title: | Evaluation of Measures of Distinctiveness. Classification of Literary Texts on the Basis of Distinctive Words |
Language: | English |
Date: | 21 February 2023 |
Place of Publication: | Darmstadt |
Year of primary publication: | 2022 |
Journal or Publication Title: | Journal of Computational Literary Studies |
Volume of the journal: | 1 |
Issue Number: | 1 |
Collation: | 21 pages |
DOI: | 10.26083/tuprints-00023252 |
Origin: | Secondary publication from TUjournals |
Abstract: | This paper concerns an empirical evaluation of nine different measures of distinctiveness or ‘keyness’ in the context of Computational Literary Studies. We use nine different sets of literary texts (specifically, novels) written in seven different languages as a basis for this evaluation. The evaluation is performed as a downstream classification task, where segments of the novels need to be classified by subgenre or period of first publication. The classifier receives different numbers of features identified using different measures of distinctiveness. The main contribution of our paper is that we can show that across a wide variety of parameters, but especially when only a small number of features is used, (more recent) dispersion-based measures very often outperform other (more established) frequency-based measures by significant margins. Our findings support an emerging trend to consider dispersion as an important property of words in addition to frequency. |
Uncontrolled Keywords: | keyness, evaluation, literary texts, distinctiveness |
Status: | Publisher's Version |
URN: | urn:nbn:de:tuda-tuprints-232529 |
Additional Information: | Originally a conference publication: 1st Annual Conference of Computational Literary Studies, 1-2 June 2022, Darmstadt, Germany |
Classification DDC: | 800 Literature > 800 Literature, rhetoric and criticism |
Divisions: | 02 Department of History and Social Science > Institut für Sprach- und Literaturwissenschaft > Digital Philology – Modern German Literary Studies |
Date Deposited: | 21 Feb 2023 10:19 |
Last Modified: | 22 Jul 2024 08:12 |
URI: | https://tuprints.ulb.tu-darmstadt.de/id/eprint/23252 |