Providing constructive feedback to paper authors is a core component of peer review. With reviewers increasingly having less time to perform reviews, automated support systems are required to ensure high reviewing quality, thus making the feedback in reviews useful for authors. To this end, we identify four key aspects of review comments (individual points in weakness sections of reviews) that drive the utility for authors: Actionability, Grounding & Specificity, Verifiability, and Helpfulness. To enable evaluation and development of models assessing review comments, we introduce the RevUtil dataset. We collect 1,430 human-labeled review comments and scale our data with 10k synthetically labeled comments for training purposes. The synthetic data additionally contains rationales, i.e., explanations for the aspect score of a review comment. Employing the RevUtil dataset, we benchmark fine-tuned models for assessing review comments on these aspects and generating rationales. Our experiments demonstrate that these fine-tuned models achieve agreement levels with humans comparable to, and in some cases exceeding, those of powerful closed models like GPT-4o. Our analysis further reveals that machine-generated reviews generally underperform human reviews on our four aspects.

Language

English

Department/Field

20 Department of Computer Science > Ubiquitous Knowledge Processing

DDC

000 Generalities, Computer Science, Information Science > 004 Computer Science

Institution

Universitäts- und Landesbibliothek Darmstadt

Location

Darmstadt

Event title

2025 Conference on Empirical Methods in Natural Language Processing

Event location

Suzhou, China

Event start date

04.11.2025

Event end date

09.11.2025

Book title

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Start page

28980

End page

29010

ISBN

979-8-89176-332-6

Publisher

Association for Computational Linguistics

Date of first publication

08.11.2025

Publisher DOI

10.18653/v1/2025.emnlp-main.1476

PPN

542593270

Additional information

This work has been funded by the LOEWE Distinguished Chair “Ubiquitous Knowledge Processing”, LOEWE initiative, Hesse, Germany (Grant Number: LOEWE/4a//519/05/00.002(0002)/81), by the European Union (ERC, InterText, 101054961) and by the German Research Foundation (DFG) as part of the PEER project (grant GU 798/28-1).

...is identical to publisher version

https://aclanthology.org/2025.emnlp-main.1476

...is part of

https://doi.org/10.18653/v1/2025.emnlp-main

Supplementary resources (supplement)

https://github.com/bodasadallah/RevUtil