Running LLM + RAG fully offline on Android with MNN (no cloud, no APIs)
📄 English Summary
How I ran LLM + RAG fully offline on Android using MNN
Most AI applications today lean heavily on cloud services: users upload documents, the app calls an API, and everyone waits for a response. When connectivity is slow or unavailable, the AI simply stops working. This project explores running a complete LLM + RAG pipeline fully offline on a mobile device. After months of experimentation and optimization, the result is a fully offline document AI that runs on mid-range Android devices, relies on no external APIs, and keeps documents stored entirely on the device. The approach offers a template for future offline-first mobile AI applications.
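The retrieval half of such an offline RAG pipeline can be sketched with plain vector math: document chunks are embedded on-device (in the project described above, the embedding model would run under MNN), and at query time the app scores each chunk embedding against the query embedding and keeps the top matches. The class and method names below (`OfflineRetriever`, `cosine`, `topK`) are illustrative assumptions, not the project's actual code:

```java
import java.util.Arrays;

// Minimal sketch of the on-device retrieval step in a RAG pipeline.
// Assumption: chunk embeddings were already produced locally (e.g. by an
// embedding model running under MNN); here they are plain float arrays.
public class OfflineRetriever {

    // Cosine similarity between two embedding vectors of equal length.
    static double cosine(float[] a, float[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        // Small epsilon guards against division by zero for all-zero vectors.
        return dot / (Math.sqrt(na) * Math.sqrt(nb) + 1e-12);
    }

    // Return the indices of the k chunks most similar to the query embedding,
    // sorted from most to least similar.
    static int[] topK(float[] query, float[][] chunkEmbeddings, int k) {
        Integer[] order = new Integer[chunkEmbeddings.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (i, j) -> Double.compare(
                cosine(query, chunkEmbeddings[j]),
                cosine(query, chunkEmbeddings[i])));
        int[] top = new int[Math.min(k, chunkEmbeddings.length)];
        for (int i = 0; i < top.length; i++) top[i] = order[i];
        return top;
    }
}
```

The selected chunks would then be concatenated into the prompt handed to the local LLM, so no document text ever leaves the device. A real app would persist the embeddings (e.g. in SQLite) rather than hold them in memory.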