📄 English Summary
The Single Best Way to Reduce LLM Costs (It Is Not What You Think)
Most advice on optimizing large language model (LLM) costs focuses on the price per call, but the real cost driver is unnecessary calls. Tracking which LLM outputs actually produced value showed that 35% were never read by users, another 15% were read but immediately dismissed, and only 50% led to a concrete user action. In other words, half of the LLM spend produced no value at all. The proposed fix is a simple check that confirms the user will actually engage with an output before generating it, which cuts usage costs directly.
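The idea above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the outcome labels (`never_read`, `dismissed`, `acted_on`), the `wasted_fraction` helper, and the `should_call_llm` gate are all hypothetical names chosen to mirror the percentages in the summary.

```python
from collections import Counter

# Hypothetical per-output outcome log, matching the article's 35/15/50 split.
outcomes = ["never_read"] * 35 + ["dismissed"] * 15 + ["acted_on"] * 50

def wasted_fraction(outcomes):
    """Fraction of LLM outputs that produced no user action."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    wasted = counts["never_read"] + counts["dismissed"]
    return wasted / total

def should_call_llm(engagement_rate, threshold=0.5):
    """Gate the call: skip it when historical engagement is below threshold."""
    return engagement_rate >= threshold

rate = 1 - wasted_fraction(outcomes)  # 0.5 for the article's numbers
print(should_call_llm(rate))
```

The point of the gate is that the cheapest LLM call is the one you never make: before generating, check whether outputs in this context have historically been read and acted on, and skip generation when they have not.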
Powered by Cloudflare Workers + Payload CMS + Claude 3.5
Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, etc.