No GPU? No Powerful Machine? Run Ollama in the Cloud for FREE


📄 English Summary


The setup pairs Google Colab with Ollama so that large language models (LLMs) run in the cloud and can be accessed from anywhere. By turning a Colab notebook into a personal cloud LLM server, users can connect from their local machines. Key features include: Ollama running on Colab's free GPU, public access that exposes the Ollama server to the internet, 128K context length support for handling massive prompts, background execution inside a tmux session, an instant public URL via an SSH tunnel (Pinggy), and a simple setup that only requires running the notebook cells step by step. The process consists of installing the dependencies and Ollama, starting the Ollama server in the background, exposing it through the SSH tunnel to obtain a public URL, and then connecting from anywhere.
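The notebook steps described above can be sketched as shell commands. This is a minimal sketch, not the article's exact notebook: the tmux session name, the `OLLAMA_CONTEXT_LENGTH=131072` setting for the 128K context, and the `a.pinggy.io` tunnel endpoint are assumptions, and the printed public URL is assigned by Pinggy at connect time.

```shell
# Cell 1: install dependencies and Ollama (official install script).
apt-get install -y tmux openssh-client
curl -fsSL https://ollama.com/install.sh | sh

# Cell 2: start the Ollama server in a detached tmux session so it keeps
# running in the background while later cells execute. The 131072-token
# (128K) context setting is an assumption; adjust for the model and VRAM.
tmux new-session -d -s ollama \
  "OLLAMA_CONTEXT_LENGTH=131072 ollama serve"

# Cell 3: expose the local server (default port 11434) through a Pinggy
# SSH tunnel; Pinggy prints the public URL for the tunnel it creates.
ssh -p 443 -R0:localhost:11434 -o StrictHostKeyChecking=no a.pinggy.io
```

From a local machine, the printed URL can then be used like any Ollama endpoint, for example `curl https://<assigned-subdomain>.pinggy.link/api/tags` (placeholder hostname) or by pointing a local client's `OLLAMA_HOST` at it.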

Powered by Cloudflare Workers + Payload CMS + Claude 3.5

Data sources: OpenAI, Google AI, DeepMind, AWS ML Blog, HuggingFace, etc.