Systematic Task Exploration with LLMs: A Study in Citation Text Generation

Şahinuç, Furkan ; Kuznetsov, Ilia ; Hou, Yufang ; Gurevych, Iryna
eds.: Ku, Lun-Wei ; Martins, Andre ; Srikumar, Vivek (2024)
Systematic Task Exploration with LLMs: A Study in Citation Text Generation.
The 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand (11.08.2024-16.08.2024)
doi: 10.26083/tuprints-00028922
Conference or Workshop Item, Secondary publication, Publisher's Version

2024.acl-long.265.pdf
Copyright Information: CC BY 4.0 International - Creative Commons, Attribution.

Item Type: Conference or Workshop Item
Type of entry: Secondary publication
Title: Systematic Task Exploration with LLMs: A Study in Citation Text Generation
Language: English
Date: 17 December 2024
Place of Publication: Darmstadt
Year of primary publication: August 2024
Place of primary publication: Kerrville, TX, USA
Publisher: ACL
Book Title: Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Event Title: The 62nd Annual Meeting of the Association for Computational Linguistics
Event Location: Bangkok, Thailand
Event Dates: 11.08.2024-16.08.2024
DOI: 10.26083/tuprints-00028922
Origin: Secondary publication service
Abstract:

Large language models (LLMs) bring unprecedented flexibility in defining and executing complex, creative natural language generation (NLG) tasks. Yet, this flexibility brings new challenges, as it introduces new degrees of freedom in formulating the task inputs and instructions and in evaluating model performance. To facilitate the exploration of creative NLG tasks, we propose a three-component research framework that consists of systematic input manipulation, reference data, and output measurement. We use this framework to explore citation text generation — a popular scholarly NLP task that lacks consensus on the task definition and evaluation metric and has not yet been tackled within the LLM paradigm. Our results highlight the importance of systematically investigating both task instruction and input configuration when prompting LLMs, and reveal non-trivial relationships between different evaluation metrics used for citation text generation. Additional human generation and human evaluation experiments provide new qualitative insights into the task to guide future research in citation text generation. We make our code and data publicly available.

Status: Publisher's Version
URN: urn:nbn:de:tuda-tuprints-289229
Classification DDC: 000 Generalities, computers, information > 004 Computer science
Divisions: 20 Department of Computer Science > Ubiquitous Knowledge Processing
Zentrale Einrichtungen > hessian.AI - The Hessian Center for Artificial Intelligence
Date Deposited: 17 Dec 2024 16:35
Last Modified: 17 Dec 2024 16:35
URI: https://tuprints.ulb.tu-darmstadt.de/id/eprint/28922