De-rendering, Reasoning, and Repairing Charts with Vision-Language Models

📄 English Summary

De-rendering, Reasoning, and Repairing Charts with Vision-Language Models

Data visualizations play a crucial role in scientific communication, journalism, and everyday decision-making, yet they often contain errors that distort interpretation or mislead audiences. Rule-based visualization linters can flag violations, but they lack context and do not suggest meaningful design changes. Querying general-purpose LLMs directly about visualization quality is also unreliable: because they are not trained to follow visualization design principles, their feedback is often inconsistent or incorrect. This research introduces a framework that combines chart de-rendering, automated analysis, and iterative improvement to provide actionable, interpretable feedback on visualization design. The system reconstructs a chart's structure, with the aim of improving the quality and effectiveness of the visualization.
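The summary does not say how the framework is implemented, so the following is only a toy sketch of the general shape of a "de-render, analyze, repair" loop. It assumes the de-rendering step has already produced a structured chart description, and it substitutes two hard-coded rules for the vision-language reasoning the research describes. `ChartSpec`, `lint`, `REPAIRS`, and `repair_loop` are hypothetical names chosen for illustration, not part of the described system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ChartSpec:
    """Hypothetical structured form that a de-rendering step might output."""
    mark: str              # e.g. "bar" or "line"
    y_axis_min: float      # lower bound of the y-axis
    has_axis_titles: bool  # whether both axes carry a title


def lint(spec):
    """Return (rule_id, message) pairs for every rule the spec violates.

    These two rules are illustrative stand-ins for the analysis step.
    """
    issues = []
    if spec.mark == "bar" and spec.y_axis_min != 0:
        issues.append(("bar-zero-baseline",
                       "Bar charts should start the y-axis at zero."))
    if not spec.has_axis_titles:
        issues.append(("axis-titles", "Axes should be titled."))
    return issues


# One repair per rule; each returns a modified copy of the spec.
REPAIRS = {
    "bar-zero-baseline": lambda s: replace(s, y_axis_min=0.0),
    "axis-titles": lambda s: replace(s, has_axis_titles=True),
}


def repair_loop(spec, max_rounds=5):
    """Fix one issue per round until the spec is clean or rounds run out."""
    history = []
    for _ in range(max_rounds):
        issues = lint(spec)
        if not issues:
            break
        rule_id, message = issues[0]
        spec = REPAIRS[rule_id](spec)
        history.append(message)  # keep the explanation for each change
    return spec, history


if __name__ == "__main__":
    original = ChartSpec(mark="bar", y_axis_min=40.0, has_axis_titles=False)
    fixed, log = repair_loop(original)
    print(fixed)
    for line in log:
        print("-", line)
```

Recording a message for each applied repair is what makes the feedback in this sketch explainable: every change to the spec is paired with the rule that motivated it.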
