# OpenLLM Buddy

> Flat‑rate GPU hosting for Gemma/Qwen/Open Source LLMS. OpenAI‑compatible API. No DevOps. Pay monthly. Deploy in 3 clicks.

OpenLLM Buddy is a managed open-source LLM deployment platform. Customers pick a model template, purchase GPU time packs, and receive an OpenAI-compatible API endpoint with API key access. This file helps LLM agents find authoritative pages on the public site.

- Canonical site: https://openllmbuddy.cloud
- Sitemap: https://openllmbuddy.cloud/sitemap.xml
- Robots: https://openllmbuddy.cloud/robots.txt

## Product

- [Home](https://openllmbuddy.cloud/): Agent Buddy for Hermes & OpenClaw — all tasks, 10M free tokens, $25/mo unlimited
- [Agent Buddy pricing](https://openllmbuddy.cloud/#pricing): Free: all tasks + 10M tokens; Unlimited: $25/mo all tasks
- [Self Deploy LLMs](https://openllmbuddy.cloud/selfdeploy/self-deploy): OpenLLM catalog, templates, deploy flow, and pricing
- [Included agent tasks](https://openllmbuddy.cloud/#browse): Full task catalog included in every plan
- [How it works](https://openllmbuddy.cloud/selfdeploy/self-deploy/how-it-works): Deploy flow, time packs, and API access
- [Models](https://openllmbuddy.cloud/selfdeploy/self-deploy/models): Open-source model catalog, benchmarks, and pricing
- [API documentation](https://openllmbuddy.cloud/selfdeploy/self-deploy/api-docs): OpenAI-compatible chat completions reference
- [Integrations](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations): Setup guides for agents, IDEs, and devices

## Models

- [Gemma 4 26B A4B](https://openllmbuddy.cloud/selfdeploy/self-deploy/models/gemma-4-26b): handle `gemma4:26b` — Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Despite 25.2B total parameters, only 3.8B ac
- [Qwen3.6 27B A4B](https://openllmbuddy.cloud/selfdeploy/self-deploy/models/qwen-3.6-27b): handle `qwen3.6:27b` — Qwen3.6 27B A4B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid mul

## Integrations

- [n8n](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/n8n): Automate workflows and call your model as a node.
- [OpenClaw](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/openclaw): Build AI agents and tools on an OpenAI-compatible endpoint.
- [Hermes](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/hermes): Connect agent runners to your chat completions endpoint.
- [OpenCode](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/opencode): Power developer tools with your OpenAI-compatible model.
- [Cursor](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/cursor): Override OpenAI Base URL in Cursor Settings and use your model with BYOK.
- [VS Code](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/vscode): Use the Cline extension in VS Code to connect your OpenAI-compatible endpoint.
- [Codex](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/codex): Run OpenAI Codex CLI against your Chat Completions endpoint via config.toml.
- [Raspberry Pi](https://openllmbuddy.cloud/selfdeploy/self-deploy/integrations/raspberry-pi): Full Pi OS guide: SSH, API keys, curl, Python venv, systemd, and troubleshooting.

## Optional

- [Blog](https://openllmbuddy.cloud/blog): Articles on deployments, APIs, and GPU inference
- [Support](https://openllmbuddy.cloud/support): Help with billing, deployments, and API keys
- [Privacy](https://openllmbuddy.cloud/privacy): Privacy policy