Qwen3.6 27B A4B maker logoQwen3.6 27B A4B - NVIDIA RTX 5090

qwen/qwen3.6-27b
Deploy
qwen3.6:27bReleased Apr 27, 2026159K context
Best forCodeAgentic Coding
Best fit competitor·Claude 3.5 Sonnet

Why teams pick Qwen3.6 27B A4B over Claude 3.5 Sonnet

Built for agentic coding and repo-level reasoning — the same workloads teams reach for Sonnet on, without sending code to a third-party API.

262K native context (YaRN to ~1M) on hardware you control, with Thinking Preservation across long sessions.

Apache 2.0 and self-hosted: fine-tune, air-gap, and ship features without Anthropic rate limits or data-handling policies.

About

Qwen3.6 27B A4B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It features hybrid multimodal capabilities — accepting text, image, and video inputs — and supports a 262,144-token context window.

The model is designed for agentic coding and reasoning tasks, with particular strength in repository-level code comprehension, front-end development workflows, and multi-step problem solving. It includes a built-in thinking mode for extended reasoning and preserves thinking context across conversation history. Qwen3.6 27B A4B supports 201 languages and dialects and is released under the Apache 2.0 license.

Compare

Model Cost Across Durations

Live pack pricing vs typical API estimates from 11 hours through 1 month.

API estimates for GPT-5.4 and Claude Opus 4.6 vs Qwen 3.6 27B A4B on RTX 5090.

Time pack

Qwen 3.6 27B A4B on RTX 5090

24 hours cost

$31

Lowest

GPT-5.4

24 hours cost

$35.47

Save $4.47 vs our model

Claude Opus 4.6

24 hours cost

$63.07

Save $32.07 vs our model

Models in chart

  • Qwen 3.6 27B A4B on RTX 5090
  • GPT-5.4
  • Claude Opus 4.6

At a glance

Release
Apr 27, 2026
Parameters
27.8 B (reported)
Quantization
Q4_K_M
Size
17GB
Context
159K

Benchmarks

Snapshot of third-party and official benchmark metrics for Qwen3.6 27B A4B.

Performance indexes

46
Artificial Analysis
Intelligence Index
#1 among open-weight small models (4B–40B)
77.2
SWE-bench Verified
Coding Index
Strong real-world GitHub issue resolution
59.3
Terminal-Bench 2.0
Agentic Index
Agentic terminal + tool use

Benchmark scores

GPQA Diamond
i
Graduate-level scientific reasoning
87.8%
HLE
i
Humanity's Last Exam
24.0%
IFBench
i
Instruction-following benchmark
83.9%

What it's good at

Flagship-level agentic coding and terminal use — outperforms the 397B Qwen3.5 MoE on every major coding benchmark.

Repository-level code comprehension and multi-step problem solving across long contexts.

Front-end development: QwenWebBench covers Web Design, Web Apps, Games, SVG, Data Visualization, Animation, and 3D (bilingual EN/CN).

Extended reasoning via built-in Thinking Preservation mode — preserves chain-of-thought across conversation turns.

Strong multilingual support: 201 languages and dialects.

Efficient local deployment: runs on ~18GB VRAM (Q4 quantized: ~16.8GB); dense architecture compresses more predictably than MoE.

Apps & integrations

Choose an app below. Each guide shows how to point the app at your OpenAI-compatible endpoint.

FAQ

Frequently asked questions

Common questions about Qwen3.6 27B A4B, deployment, and using it on OpenLLM Buddy.

6 questions

Ready to try it? Deploy Qwen3.6 27B A4B · Browse models