Reviewing the Reviewer: Elevating Peer Review Quality through LLM-Guided Feedback
📄 Summary
Peer review is essential for maintaining scientific quality; however, reliance on simplistic heuristics has led to a decline in standards. Previous research has treated lazy-thinking detection as a single-label task, yet a review segment may exhibit several issues at once, from lazy-thinking heuristics to broader clarity and specificity problems. Turning detection into actionable improvement requires guideline-aware feedback, which is currently lacking. This study introduces an LLM-driven framework that decomposes reviews into argumentative segments, identifies issues through a neurosymbolic module combining LLM-derived features with traditional classifiers, and generates targeted feedback using issue-specific templates refined by a genetic algorithm. Experimental results demonstrate that this method outperforms a zero-shot LLM baseline.
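The segment-detect-feedback pipeline summarized above can be sketched as follows. This is a minimal, hypothetical illustration: the cue lists, the stubbed "LLM prior" score, the 0.6 threshold, and the feedback templates are all invented here, not taken from the paper. The actual system combines real LLM-derived features with trained classifiers in its neurosymbolic module, and its templates are optimized by a genetic algorithm rather than hand-written.

```python
# Sketch of the described pipeline: segment a review, flag issues per
# segment with a hybrid score (symbolic cue match + stubbed LLM prior),
# then emit template-based feedback. All cues, weights, and templates
# below are illustrative placeholders, not the paper's implementation.

ISSUE_CUES = {
    "lazy_thinking": ["not novel", "too simple", "below acceptance"],
    "clarity": ["unclear", "hard to follow", "confusing"],
    "specificity": ["more experiments", "needs detail", "too vague"],
}

# Issue-specific feedback templates (GA-refined in the actual framework).
FEEDBACK_TEMPLATES = {
    "lazy_thinking": "Explain *why* the contribution falls short, citing specifics.",
    "clarity": "Point to the exact passage that is unclear and what is missing.",
    "specificity": "Name the concrete experiments or details you expect.",
}

def segment_review(review: str) -> list[str]:
    """Crude stand-in for argumentative segmentation: split on sentences."""
    return [s.strip() for s in review.split(".") if s.strip()]

def detect_issues(segment: str, llm_prior: float = 0.3) -> list[str]:
    """Hybrid multi-label detection: a segment can trigger several issues.

    A stubbed LLM prior is boosted when a symbolic cue matches, then
    thresholded; a real system would use learned classifier scores.
    """
    seg = segment.lower()
    flagged = []
    for issue, cues in ISSUE_CUES.items():
        score = llm_prior + (0.5 if any(c in seg for c in cues) else 0.0)
        if score >= 0.6:
            flagged.append(issue)
    return flagged

def review_feedback(review: str) -> list[tuple[str, str, str]]:
    """(segment, issue, feedback) triples for every flagged issue."""
    return [
        (seg, issue, FEEDBACK_TEMPLATES[issue])
        for seg in segment_review(review)
        for issue in detect_issues(seg)
    ]

feedback = review_feedback(
    "The idea is not novel. Section 3 is unclear and needs more experiments."
)
for seg, issue, tip in feedback:
    print(f"[{issue}] {seg!r} -> {tip}")
```

Note the multi-label behavior: the second segment is flagged for both clarity and specificity, which a single-label formulation would miss.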