We Built a Service That Catches LLM Drift Before Your Users Do

Source: We Built a Service That Catches LLM Drift Before Your Users Do

Published: March 12, 2026

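📄 Summary (translated from Chinese)

After an LLM-based feature shipped, early testing went well and user feedback was positive. A few weeks later, however, the support inbox was flooded with complaints: outputs were wrong, the JSON the application parsed was malformed, and the classifier's answers had become inconsistent. This phenomenon is known as LLM drift, and developers often become aware of it only after users report it. In February 2025, developers on r/LLMDevs reported that GPT-4o had changed its behavior without any advance notice, significantly changing its outputs. The problem is not unique to OpenAI: Claude, Gemini, and even some "frozen" model versions can change behavior unexpectedly, causing trouble for developers.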

🏷️ Related Tags

#LLM drift #user feedback #developers #model changes

📄 English Summary

We Built a Service That Catches LLM Drift Before Your Users Do

An LLM-powered feature ships after performing well in testing and drawing positive feedback during beta. Three weeks later, the support inbox fills with complaints: incorrect outputs, JSON the application can no longer parse, and inconsistent classifier responses. The model had drifted without notice, and the developers found out only from their users. LLM drift is more common than many developers expect. In February 2025, developers on r/LLMDevs reported that GPT-4o changed its behavior without advance warning, significantly altering the outputs of existing prompts. The issue is not limited to OpenAI: Claude, Gemini, and even supposedly frozen model versions can change behavior unexpectedly, causing real trouble for developers.
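The symptoms described above (changed outputs, JSON that stops parsing) can be checked mechanically before users see them. The following is a minimal sketch of one possible check, not the service's actual method, which the summary does not describe. It assumes you have stored baseline outputs for a fixed set of prompts and have re-run the same prompts against the current model; the names `drift_report`, `baseline`, and `current` are hypothetical.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON."""
    try:
        json.loads(text)
        return True
    except (json.JSONDecodeError, TypeError):
        return False


def drift_report(baseline: dict[str, str], current: dict[str, str]) -> dict:
    """Compare current model outputs to stored baseline outputs.

    Both arguments map a prompt ID to the raw model output (hypothetical
    storage format). Only prompt IDs present in both are compared.
    """
    shared = sorted(set(baseline) & set(current))
    # Outputs whose text differs from the baseline (ignoring outer whitespace).
    changed = [p for p in shared if baseline[p].strip() != current[p].strip()]
    # Outputs that used to be valid JSON and no longer parse.
    broken_json = [
        p for p in shared
        if is_valid_json(baseline[p]) and not is_valid_json(current[p])
    ]
    change_rate = len(changed) / len(shared) if shared else 0.0
    return {
        "checked": len(shared),
        "changed": changed,
        "broken_json": broken_json,
        "change_rate": change_rate,
    }


# Example: the second prompt used to return JSON and now returns prose.
baseline = {
    "classify-1": '{"label": "refund"}',
    "classify-2": '{"label": "billing"}',
}
current = {
    "classify-1": '{"label": "refund"}',
    "classify-2": "Sure! The label is billing.",
}
report = drift_report(baseline, current)
print(report["change_rate"], report["broken_json"])
```

Exact string comparison is deliberately strict and only suits deterministic settings such as classification with fixed labels; for free-form text, a looser similarity measure would likely be needed.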

Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others

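📚 Related Articles

AI Coding Created a New Class of Creators. I'm One of Them.

AI Became My Learning Assistant

The Claude CLI "Leak": Nobody Wins, AI Still Hallucinates, and Companies Keep Making the Same Mistakes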