2024
First publication
Preprint

BookNLP-fr, the French Versant of BookNLP. A Tailored Pipeline for 19th and 20th Century French Literature

File(s)
Main publication
3924_BookNLPfr_Conference_Version.pdf
CC BY 4.0 International
Format: Adobe PDF
Size: 1.94 MB
TUDa URI
tuda/11861
URN
urn:nbn:de:tuda-tuprints-273969
DOI
10.26083/tuprints-00027396
Authors
Mélanie-Becquet, Frédérique ORCID 0000-0002-8216-9437
Barré, Jean ORCID 0000-0002-1579-0610
Seminck, Olga ORCID 0000-0003-4617-5992
Plancq, Clément ORCID 0000-0001-6189-0703
Naguib, Marco ORCID 0009-0003-2950-8852
Pastor, Martial ORCID 0009-0004-8128-5497
Poibeau, Thierry ORCID 0000-0003-3669-4051
Abstract

This paper presents BookNLP-fr: the adaptation to French of BookNLP, an existing NLP pipeline tailored for literary texts in English. We provide an overview of the challenges involved in the adaptation of such a pipeline to a new language: from the challenges related to data annotation up to the development of specialized modules of entity recognition and coreference. Moving beyond the technical aspects, we explore practical applications of BookNLP-fr with a canonical task for computational literary studies: subgenre classification. We show that BookNLP-fr provides more relevant and – even more importantly – more interpretable features to perform automatic subgenre classification than the traditional bag-of-words approach. BookNLP-fr makes NLP techniques available to a larger public and constitutes a new toolkit to process large numbers of digitized books in French. This allows the field to gain a deeper literary understanding through the practice of distant reading.

Keywords

Natural Language Proc...

Computational Literar...

French Literature

Coreference Resolutio...

Entity Recognition

Subgenre Classificati...

Language
English
Department/Field
02 Department of History and Social Sciences > Institute of Linguistics and Literary Studies > Digital Philology - Modern German Literary Studies
DDC
800 Literature > 800 Literature, rhetoric, literary studies
Institution
Universitäts- und Landesbibliothek Darmstadt
Place
Darmstadt
Series title
CCLS2024 Conference Preprints
Series volume number
3
Journal issue number
1
PPN
518965619
Additional information
This paper was submitted to the conference track of JCLS. It has been peer reviewed and accepted for presentation and discussion at the 3rd Annual Conference of Computational Literary Studies in Vienna, Austria, in June 2024.
Additional links (organization)
https://jcls.io/site/ccls2024/
