# OpenLLM Buddy > Flat‑rate GPU hosting for Gemma/Qwen/Open Source LLMS. OpenAI‑compatible API. No DevOps. Pay monthly. Deploy in 3 clicks. OpenLLM Buddy is a managed open-source LLM deployment platform. Customers pick a model template, purchase GPU time packs, and receive an OpenAI-compatible API endpoint with API key access. This file helps LLM agents find authoritative pages on the public site. - Canonical site: https://openllmbuddy.cloud - Sitemap: https://openllmbuddy.cloud/sitemap.xml - Robots: https://openllmbuddy.cloud/robots.txt ## Product - [Home](https://openllmbuddy.cloud/): Agent Buddy for Hermes & OpenClaw — all tasks, 10M free tokens, $25/mo unlimited - [Agent Buddy pricing](https://openllmbuddy.cloud/#pricing): Free: all tasks + 10M tokens; Unlimited: $25/mo all tasks - [Self Deploy LLMs](https://openllmbuddy.cloud/selfdeploy/self-deploy): OpenLLM catalog, templates, deploy flow, and pricing - [Included agent tasks](https://openllmbuddy.cloud/#browse): Full task catalog included in every plan - [How it works](https://openllmbuddy.cloud/selfdeploy/self-deploy/how-it-works): Deploy flow, time packs, and API access - [Models](https://openllmbuddy.cloud/selfdeploy/self-deploy/models): Open-source model catalog, benchmarks, and pricing - [API documentation](https://openllmbuddy.cloud/selfdeploy/self-deploy/api-docs): OpenAI-compatible chat completions reference - [Integrations](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations): Setup guides for agents, IDEs, and devices ## Models - [Gemma 4 26B A4B](https://openllmbuddy.cloud/selfdeploy/self-deploy/models/gemma-4-26b): handle `gemma4:26b` — Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B ac - [Qwen3.6 27B A4B](https://openllmbuddy.cloud/selfdeploy/self-deploy/models/qwen-3.6-27b): handle `qwen3.6:27b` — Qwen3.6 27B A4B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid mul ## Integrations - [n8n](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/n8n): Automate workflows and call your model as a node. - [OpenClaw](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/openclaw): Build AI agents and tools on an OpenAI-compatible endpoint. - [Hermes](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/hermes): Connect agent runners to your chat completions endpoint. - [OpenCode](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/opencode): Power developer tools with your OpenAI-compatible model. - [Cursor](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/cursor): Override OpenAI Base URL in Cursor Settings and use your model with BYOK. - [VS Code](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/vscode): Use the Cline extension in VS Code to connect your OpenAI-compatible endpoint. - [Codex](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/codex): Run OpenAI Codex CLI against your Chat Completions endpoint via config.toml. - [Raspberry Pi](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/raspberry-pi): Full Pi OS guide: SSH, API keys, curl, Python venv, systemd, and troubleshooting. ## Optional - [Blog](https://openllmbuddy.cloud/blog): Articles on deployments, APIs, and GPU inference - [Support](https://openllmbuddy.cloud/support): Help with billing, deployments, and API keys - [Privacy](https://openllmbuddy.cloud/privacy): Privacy policy