📄 English Summary
Open Source vs Proprietary LLMs: The Real Cost Breakdown
For workloads under 1 billion tokens per month, using APIs is the optimal choice, with little practical difference between proprietary models and hosted open-source models. Between 1 billion and 10 billion tokens per month, hosted open-source APIs from Together.ai or Groq are typically the cheapest option. Above 10 billion tokens per month, self-hosting can become advantageous, but only if an MLOps team is already in place. The claim that 'open source is free' overlooks $300K to $600K per year in engineering overhead. Overall, choosing the right model and deployment method requires weighing both cost and team capability.
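The tiers above boil down to a break-even calculation: API cost scales linearly with token volume, while self-hosting is roughly a fixed monthly cost (GPU capacity plus staffing). The sketch below illustrates the arithmetic; every price in it is a hypothetical placeholder, not a quote from any provider, and the $450K staffing figure is simply the midpoint of the $300K-$600K range cited above.

```python
# Illustrative break-even sketch for the token-volume tiers above.
# All prices are hypothetical placeholders, not real provider quotes.

def monthly_cost_api(tokens: float, price_per_million: float) -> float:
    """Pay-as-you-go API cost: linear in token volume."""
    return tokens / 1_000_000 * price_per_million

def monthly_cost_self_hosted(gpu_cost_per_month: float,
                             engineering_cost_per_year: float) -> float:
    """Self-hosting: roughly fixed infra plus staffing, amortized monthly.

    Simplification: assumes provisioned GPU capacity already covers
    the volume, so cost does not grow with tokens served.
    """
    return gpu_cost_per_month + engineering_cost_per_year / 12

# Hypothetical inputs: $5.00 per million tokens via an API,
# $20K/month of GPU capacity, $450K/year of MLOps staffing.
for tokens in (1e9, 10e9, 50e9):
    api = monthly_cost_api(tokens, 5.00)
    hosted = monthly_cost_self_hosted(20_000, 450_000)
    cheaper = "API" if api < hosted else "self-hosting"
    print(f"{tokens / 1e9:>4.0f}B tokens/month: "
          f"API ${api:,.0f} vs self-hosted ${hosted:,.0f} -> {cheaper}")
```

With these placeholder numbers the fixed self-hosting cost is about $57.5K/month, so the crossover lands somewhere above 10B tokens/month, which is consistent with the tiers described above; with different prices the crossover point moves.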
Powered by Cloudflare Workers + Payload CMS + Claude 3.5
Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, etc.