
Key Highlights

  • Top Coding Agent: 61.6 on Terminal-Bench 2.0, surpassing Claude Opus 4.5 (59.3), and 78.8 on SWE-bench Verified
  • Efficient MoE Architecture: 72B total params / 8 experts / 2 active, ~18B effective compute, roughly 3x faster than Claude Opus 4.5
  • Million-Token Context: 1M token context window, process ~750K words or an entire large codebase in a single request
  • Always-On CoT: Always-on chain-of-thought reasoning + native function calling, purpose-built for Agent workflows
  • Native Multimodal: Trained on native multimodal data, supports generating frontend code from screenshots and design mockups

Background

On April 2, 2026, Alibaba’s Qwen team officially released Qwen3.6-Plus, the first model in the Qwen3.6 series, already being called “China’s most powerful coding model.” Built on an upgraded MoE (Mixture of Experts) architecture, it delivers top-tier coding and Agent capabilities while maintaining excellent inference efficiency, competing directly with Claude Opus 4.5. Qwen3.6-Plus marks a milestone for Chinese AI models entering the global top tier in coding Agent capabilities: it surpassed Claude Opus 4.5 on Terminal-Bench 2.0 and broke daily call-volume records on OpenRouter shortly after launch. API Yi now offers qwen3.6-plus with OpenAI-compatible API access.

Detailed Analysis

Core Features

Top Coding Agent

61.6 on Terminal-Bench 2.0 (surpassing Claude Opus 4.5) and 78.8 on SWE-bench Verified; autonomously decomposes tasks, plans execution paths, and iterates until completion

Efficient MoE Architecture

72B total / 8 expert modules / 2 active per inference, ~18B effective compute, fast and cost-efficient

Million-Token Context

1M token window, process entire large codebases or ~750K words of text in a single pass
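As a back-of-the-envelope sketch of what fits: assuming ~4 characters per token (a common heuristic for English text and code, not Qwen's actual tokenizer), you can estimate whether a codebase fits in the window while leaving room for the 65,536-token max output. The helper below is illustrative, not part of any official SDK.

```python
# Heuristic only: ~4 chars/token is a rough average for English text/code.
# Use the model's real tokenizer for exact counts.
CHARS_PER_TOKEN = 4
CONTEXT_WINDOW = 1_000_000   # Qwen3.6-Plus context window
MAX_OUTPUT = 65_536          # Qwen3.6-Plus max output tokens

def fits_in_context(total_chars: int, reserved_output: int = MAX_OUTPUT) -> bool:
    """Estimate whether `total_chars` of source fits in the 1M-token window,
    after reserving space for the model's output."""
    est_tokens = total_chars // CHARS_PER_TOKEN
    return est_tokens + reserved_output <= CONTEXT_WINDOW

# A ~3 MB codebase is roughly 750K tokens and still fits with output reserved;
# a ~4 MB codebase would not.
print(fits_in_context(3_000_000))
print(fits_in_context(4_000_000))
```

Under this heuristic, the quoted "~750K words" figure corresponds to roughly 1M tokens at the usual ~0.75 words-per-token ratio for English prose.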

Native Multimodal

Trained on native multimodal data, generates frontend pages from screenshots and design mockups
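A minimal sketch of how a design screenshot might be passed in, using the OpenAI-compatible message format with an `image_url` content part and a base64 data URL. This follows the standard OpenAI vision convention; whether qwen3.6-plus accepts data URLs exactly this way is an assumption, and the placeholder bytes stand in for a real screenshot file.

```python
import base64

# Stand-in for real screenshot bytes (a PNG file header); in practice you
# would read the mockup from disk, e.g. open("mockup.png", "rb").read().
png_bytes = b"\x89PNG\r\n\x1a\n"
data_url = "data:image/png;base64," + base64.b64encode(png_bytes).decode()

# OpenAI-style multimodal message: text instruction + image content part.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Generate a React component matching this mockup."},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
]

# This payload would then be sent via an OpenAI-compatible client:
# client.chat.completions.create(model="qwen3.6-plus", messages=messages)
```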

Benchmark Performance

| Category | Benchmark | Qwen3.6-Plus | Comparison |
|---|---|---|---|
| Coding | Terminal-Bench 2.0 | 61.6 | Claude Opus 4.5: 59.3 |
| Coding | SWE-bench Verified | 78.8 | Claude Opus 4.5: 80.9 |
| Document | OmniDocBench v1.5 | 91.2 | Global #1 |
| Real-World QA | RealWorldQA | 85.4 | Leads mainstream models |
| Web Agent | QwenWebBench (Elo) | 1502 | Just behind Gemini 3 Pro |
Data sources: Alibaba Qwen official blog (qwen.ai/blog), OpenRouter benchmark data. Qwen3.6-Plus was officially released on April 2, 2026.
vs Competitors:
  • vs Claude Opus 4.5: Terminal-Bench ahead (61.6 vs 59.3), SWE-bench close (78.8 vs 80.9)
  • vs Gemini 3 Pro: OmniDocBench leads, QwenWebBench close
  • Inference speed: Community tests show ~3x faster than Claude Opus 4.5

Technical Specifications

| Parameter | Qwen3.6-Plus |
|---|---|
| Architecture | MoE (Mixture of Experts) |
| Total Parameters | 72B |
| Expert Count | 8 (2 active per inference) |
| Effective Compute | ~18B |
| Context Window | 1,000,000 tokens |
| Max Output | 65,536 tokens |
| Chain-of-Thought | Always-on CoT |
| Function Calling | Native support |
| Multimodal | Text + image input |
| Model Name | qwen3.6-plus |

Practical Applications

Coding Agent

Autonomous code repair, repo-level refactoring, complex terminal operations — ideal for Claude Code / Cursor-like coding assistants

Long Document Processing

1M token context for analyzing entire codebases or lengthy documents, OmniDocBench global #1

Frontend Development

Auto-generate frontend pages from design mockups or screenshots, with interactive editing and code completion

Multi-Step Agents

Always-on CoT + native function calling, perfect for complex workflow orchestration and multi-step task execution
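To make the function-calling side concrete, here is a hedged sketch of defining a tool in the standard OpenAI function-calling schema and dispatching a tool call locally. The tool name `get_file_contents` and the stubbed handler are purely illustrative, not part of any Qwen or API Yi interface; the real loop would pass `tools` to the API and feed each result back as a `{"role": "tool"}` message.

```python
import json

# Illustrative tool definition in the OpenAI function-calling format.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_file_contents",
            "description": "Read a file from the repository so the model can inspect it.",
            "parameters": {
                "type": "object",
                "properties": {
                    "path": {"type": "string",
                             "description": "Repo-relative file path"},
                },
                "required": ["path"],
            },
        },
    }
]

def dispatch_tool_call(name: str, arguments: str) -> str:
    """Route a model-issued tool call (name + JSON arguments) to a local
    handler. Stubbed here; a real agent would read the file from disk."""
    args = json.loads(arguments)
    if name == "get_file_contents":
        return f"# contents of {args['path']} (stubbed)"
    raise ValueError(f"unknown tool: {name}")

# In a real agent loop: pass `tools` to client.chat.completions.create(...),
# execute each tool_call the model returns, append the result as a
# {"role": "tool"} message, and call the API again until the task is done.
result = dispatch_tool_call("get_file_contents", '{"path": "src/main.py"}')
```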

Code Examples

Python (using the openai SDK):

```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.apiyi.com/v1"
)

response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {"role": "system", "content": "You are a professional coding assistant."},
        {"role": "user", "content": "Write a Python LRU cache implementation with thread-safe concurrent access."}
    ],
    max_tokens=8192
)

print(response.choices[0].message.content)
```

JavaScript (using the openai npm package; top-level await requires an ES module):

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-api-key",
  baseURL: "https://api.apiyi.com/v1",
});

const response = await client.chat.completions.create({
  model: "qwen3.6-plus",
  messages: [
    { role: "user", content: "Review this code for potential security issues and suggest fixes." }
  ],
  max_tokens: 8192,
});

console.log(response.choices[0].message.content);
```

Best Practices

Qwen3.6-Plus always enables chain-of-thought reasoning, so output may include reasoning steps. If you only need the final answer, specify “Give the result directly without reasoning process” in the system prompt.
  • Coding: Leverage the million-token context window by including your entire project as context for more precise code suggestions
  • Agent workflows: Take full advantage of native function calling — define clear tool descriptions and let the model plan execution autonomously
  • Long documents: With OmniDocBench 91.2 (global #1), Qwen3.6-Plus excels at document understanding and information extraction

Pricing & Availability

Pricing

| Item | Price |
|---|---|
| Input | See API Yi console |
| Output | See API Yi console |

Deposit Bonus

Current deposit bonus promotion is ongoing — the more you deposit, the bigger the bonus. See Deposit Promotions for details.

Summary & Recommendations

Qwen3.6-Plus is currently the most powerful Chinese coding Agent model. Its Terminal-Bench 2.0 score of 61.6 surpasses Claude Opus 4.5, and its efficient MoE architecture delivers strong speed and cost advantages; the combination of 1M-token context, always-on CoT, and native function calling makes it ideal for coding assistants, Agent workflows, and long document processing. Recommended for:
  • Developers needing high-quality coding assistants
  • Teams building Agent workflows
  • Users processing very long documents or large codebases
  • Those seeking cost-effective alternatives to premium models
Sources: Alibaba Qwen official blog (qwen.ai/blog), ChinaNews, IT Home. Data retrieved: April 6, 2026. The Qwen3.6 series will subsequently release the more powerful Qwen3.6-Max flagship model and open-source models of various sizes.