GPT-5.4 Killed the Specialist Model

Source: GPT-5.4 Killed the Specialist Model

Published: March 29, 2026

📄 English Summary

GPT-5.4, released on March 5, has fundamentally changed how AI agents are built. For the past year, developers had to juggle several models: coding tasks went to Codex, reasoning tasks to a thinking model, and vision tasks to a multimodal model, with care taken not to send a SQL query to the poetry model. GPT-5.4 makes that patchwork of models unnecessary. It scores 57.7% on SWE-bench Pro, 75% on OSWorld (above the 72.4% human expert baseline), and 83% on GDPval. It also supports a 1M-token context window through the API. These results put it at the frontier in areas such as coding and desktop automation.
