GPT-5.4: Native Computer Use, 83% Pro Performance

📄 English Summary

GPT-5.4: Native Computer Use, 83% Pro Performance

OpenAI's GPT-5.4, released on March 5, 2026, is the first general-purpose model with built-in native computer use. In real-world work comparisons, it matches or exceeds professionals 83% of the time, up from 70.9% for its predecessor, GPT-5.2. The model can operate applications, browsers, and workflows without separate plugins. Its 1-million-token context window supports long-horizon agentic tasks without chunking the input. GPT-5.4 also makes 33% fewer factual errors than GPT-5.2, a reduction reported in both internal and third-party benchmarks.
