DesignSense: A Human Preference Dataset and Reward Modeling Framework for Graphic Layout Generation


📄 English Summary

DesignSense: A Human Preference Dataset and Reward Modeling Framework for Graphic Layout Generation

Graphic layouts are a key medium for visual communication across many channels. Recent layout generation models have shown impressive capabilities, yet they often fail to align with nuanced human aesthetic judgments. Existing preference datasets and reward models trained for text-to-image generation do not generalize well to layout evaluation, where quality is determined by the spatial arrangement of identical elements. To address this gap, DesignSense-10k is introduced: a large-scale dataset of 10,235 human-annotated preference pairs for graphic layout evaluation. A five-stage curation pipeline is proposed that generates visually coherent layout transformations across diverse aspect ratios, using semantic information for optimization.
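The summary does not say how the reward model is trained on the preference pairs. As a minimal sketch, assuming the common Bradley-Terry formulation for pairwise preferences (an assumption, not a method confirmed by the summary), the loss on one pair and a simple agreement metric might look like this. The `PreferencePair` schema and both function names are hypothetical and exist only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """One annotated comparison between two layouts (hypothetical schema)."""
    preferred_score: float  # reward-model score for the layout annotators preferred
    rejected_score: float   # reward-model score for the other layout


def bradley_terry_loss(pair: PreferencePair) -> float:
    """Negative log-likelihood that the preferred layout wins under Bradley-Terry.

    Equals -log(sigmoid(margin)), computed in a numerically stable way.
    """
    margin = pair.preferred_score - pair.rejected_score
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def pairwise_accuracy(pairs: list[PreferencePair]) -> float:
    """Fraction of pairs where the model scores the preferred layout higher."""
    return sum(p.preferred_score > p.rejected_score for p in pairs) / len(pairs)
```

Under this assumption, a reward model that scores the human-preferred layout well above the rejected one incurs a loss near zero, while one that ranks the pair the wrong way round is penalized roughly in proportion to the margin.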
