UI-Venus-1.5 Technical Report

Source: UI-Venus-1.5 Technical Report

Published: February 11, 2026

📄 English Summary

UI-Venus-1.5 is a unified, end-to-end GUI agent designed for robust real-world use. The model family comprises two dense variants (2B and 8B) and one mixture-of-experts variant (30B-A3B) to suit different downstream application scenarios. Compared with its predecessor, UI-Venus-1.5 introduces three key technical advances: first, a comprehensive Mid-Training stage that uses 10 billion tokens across 30+ datasets to establish foundational GUI semantics; second, Online Reinforcement Learning with full-trajectory rollouts to strengthen the model's learning; and third, optimizations to inference speed and accuracy that improve performance on complex tasks. The work lays a technical foundation for future GUI automation.
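
To make "full-trajectory rollouts" concrete, here is a minimal Python sketch of the general idea: run a policy until the episode ends, then credit the final outcome back to every step. The summary gives no details of the report's environment, reward, or optimizer, so `ToyEnv`, `Step`, `rollout`, and the 0/1 outcome reward are all hypothetical stand-ins, not the authors' method.

```python
from dataclasses import dataclass


@dataclass
class Step:
    observation: int
    action: int
    ret: float = 0.0  # return credited to this step after the episode ends


class ToyEnv:
    """Hypothetical stand-in for a GUI environment (assumption, not from the report).

    The task succeeds when an internal counter reaches `target`
    before `max_steps` actions have been taken.
    """

    def __init__(self, target: int = 3, max_steps: int = 6):
        self.target = target
        self.max_steps = max_steps
        self.state = 0
        self.t = 0

    def reset(self) -> int:
        self.state = 0
        self.t = 0
        return self.state

    def step(self, action: int):
        # action 1 advances the counter, action 0 does nothing
        self.state += action
        self.t += 1
        success = self.state == self.target
        done = success or self.t >= self.max_steps
        return self.state, done, success


def rollout(env, policy, gamma: float = 1.0):
    """Collect one complete episode, then assign trajectory-level returns."""
    obs = env.reset()
    steps = []
    done = False
    success = False
    while not done:
        action = policy(obs)
        next_obs, done, success = env.step(action)
        steps.append(Step(obs, action))
        obs = next_obs

    # Assumed reward: 1.0 only if the whole task succeeded.
    outcome = 1.0 if success else 0.0
    # Walk backwards so the last step gets the undiscounted outcome.
    for k, step in enumerate(reversed(steps)):
        step.ret = outcome * gamma**k
    return steps, outcome
```

Calling `rollout(ToyEnv(), lambda obs: 1)` yields a three-step successful trajectory in which every step carries the final outcome; a policy-gradient update would then weight each step's log-probability by `ret`. That update is omitted here because the summary does not say which algorithm the report uses.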
