📄 English Summary
When an AI Keeps Forgetting: Why LLM Workflows Collapse and What to Build Instead
After six months of developing a career intelligence project using ChatGPT and Claude, the author observed significant problems with memory and terminology consistency. Precisely defined terms began to drift: for instance, "Career Intelligence Framework" became "Career Intel System" in one session and "CI Framework" in another. Decisions made weeks earlier resurfaced as open questions, forcing the author to re-explain the same concepts repeatedly. The model's grasp of references also grew vague, so the same instruction could point to different files across sessions. On investigation, the cause turned out to be not a misconfiguration but the token-window limits, memory constraints, and architecture of large language models, which together prevent consistent terminology tracking.
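One mitigation implied by the title ("what to build instead") is to keep canonical terminology outside the model and normalize drifted aliases before text re-enters a session. The sketch below is illustrative, not the author's actual implementation; the glossary entries and function names are assumptions.

```python
import re

# Hypothetical canonical glossary: maps the agreed-upon term to the
# drifted aliases observed across sessions (examples from the article).
CANONICAL_TERMS = {
    "Career Intelligence Framework": [
        "Career Intel System",
        "CI Framework",
    ],
}

def normalize_terminology(text: str) -> str:
    """Replace known drifted aliases with their canonical term."""
    for canonical, aliases in CANONICAL_TERMS.items():
        for alias in aliases:
            text = re.sub(re.escape(alias), canonical, text)
    return text
```

Run against each session transcript (or against prompts before they are sent), this keeps the externally maintained glossary, not the model's fading context, as the source of truth for terminology.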
Powered by Cloudflare Workers + Payload CMS + Claude 3.5
Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, and others