Key Takeaways
- New open-weight flagship:
MiniMax-M3is now live on APIYI (mind the capitalization) — the first open-weight model to combine frontier coding-agent performance, a 1M-token context window, and native multimodality - Beats closed flagships on coding: SWE-Bench Pro 59.0, ahead of GPT-5.5 and Gemini 3.1 Pro; Terminal-Bench 2.1 at 66.0, MCP Atlas at 74.2
- Tops autonomous browsing: BrowseComp 83.5, above Claude Opus 4.7 (79.3); first place on the Claw-Eval end-to-end agent benchmark
- MSA sparse attention: MiniMax Sparse Attention replaces full attention with KV-block selection — 1M-context inference at roughly 1/20 the cost of the previous generation
- Limited-time 50% off: APIYI matches the official discount — $0.30 input / $1.20 output per 1M tokens (0-512K tier), ending June 8, 2026, 00:00 (UTC+8)
- Stack recharge bonuses: combined with APIYI recharge promotions, the effective price drops to roughly 41% of list (50% ÷ 1.2)
MiniMax officially released M3 on June 1, 2026, pledging open weights and a technical report on Hugging Face and GitHub within 10 days. Sources:
minimax.io/blog/minimax-m3, venturebeat.com, openrouter.ai/minimax/minimax-m3. Data retrieved: 2026-06-05.Background
MiniMax’s M series has always pushed the “long context + high cost-efficiency” frontier, and M3 takes that strategy to a new level. Released on June 1, 2026, M3 is positioned as the first open-weight model to combine frontier coding-agent capability, a 1M-token context window, and native multimodality — including image and video input plus desktop computer operation. The headline innovation is MSA (MiniMax Sparse Attention): replacing full attention with KV-block selection to drastically cut per-token compute at long context. Per official figures, inference at 1M tokens costs roughly 1/20 of the previous generation, with substantially faster prefill and decode. For the first time, “million-token context” is genuinely affordable. The most actionable part for developers is the price: MiniMax launched the model with a limited-time 50% discount, and APIYI has matched it — stackable with recharge promotions for an even lower effective rate.Deep Dive
Core Features
1M-Token Context
Native million-token windowTrue 1M context powered by MSA sparse attention — entire repos, long video scripts, and extended agent trajectories in a single pass.
MSA Sparse Attention
Long-context cost cut to 1/20KV-block selection replaces full attention; officially 4×+ faster prefill/decode than open-source alternatives at long context.
Frontier Coding Agent
SWE-Bench Pro 59.0Ahead of GPT-5.5 and Gemini 3.1 Pro, Terminal-Bench 2.1 at 66.0 — production-ready for real software engineering tasks.
Native Multimodality
Image / video / computer useTrained on interleaved text-image data from inception; supports image and video input plus desktop operation, beating Gemini 3.1 Pro on OmniDocBench.
Benchmark Highlights
Data from the official MiniMax release and third-party coverage:| Benchmark | MiniMax-M3 | Comparison |
|---|---|---|
| SWE-Bench Pro (real-world SWE) | 59.0 | Above GPT-5.5, Gemini 3.1 Pro |
| Terminal-Bench 2.1 (terminal agent) | 66.0 | Leading among open models |
| MCP Atlas (tool use) | 74.2 | — |
| BrowseComp (autonomous browsing) | 83.5 | Above Claude Opus 4.7 (79.3) |
| Claw-Eval (end-to-end agent) | #1 | Top of all tested models |
| OmniDocBench (document multimodal) | Leading | Above Gemini 3.1 Pro |
Technical Specs
Engineering parameters
- Model ID:
MiniMax-M3(case-sensitive) - Attention: MSA (MiniMax Sparse Attention, KV-block selection)
- Context window: 1M tokens (native)
- Multimodality: ✅ image / video input, desktop computer operation
- Billing: tiered by input length (0-512K / above 512K)
- API compatibility: OpenAI ChatCompletions compatible
- Open weights: weights + technical report on Hugging Face / GitHub within 10 days
Practical Usage
Recommended Scenarios
Whole-Repo Coding Agent
1M context + SWE-Bench Pro 59.0 — large-repo refactors and cross-file PR tasks without chunking or retrieval
Autonomous Browsing & Research
Tops BrowseComp at 83.5 — ideal for deep-research agents and automated information gathering
Long Document / Video Understanding
Native multimodality + million-token context — analyze massive contracts, reports, or video content in one pass
Computer-Use Agents
Native desktop operation — build RPA, automated testing, and computer-use agents
Quick Start (OpenAI-Compatible API)
Best Practices
- Model ID capitalization: the ID is
MiniMax-M3— a wrong case returns a 404 model-not-found error - Cost control: keep single-request input under 512K tokens to stay on the lowest tier; summarize/trim for ultra-long tasks
- Temperature:
temperature=0.2 ~ 0.4recommended for agent/coding workloads - Streaming: prefill takes longer at extreme context — enable streaming for better perceived latency
- Discount window: the 50% off rate ends June 8, 00:00 (UTC+8) — schedule heavy evaluations and batch jobs before then
Pricing & Availability
Price Table (USD / 1M tokens, limited-time 50% off)
| Input length tier | Prompt (input) | Completion (output) |
|---|---|---|
| 0 - 512K | $0.3000 | $1.2000 |
| Above 512K | $0.6000 | $2.4000 |
APIYI matches MiniMax’s official limited-time 50% discount — the table above is the current live rate, billed in tiers by input length. The discount ends June 8, 2026, 00:00 (UTC+8); subsequent pricing is to be determined.
Stack with Recharge Promotions
The 50%-off rate stacks with APIYI recharge bonuses, bringing the effective price down to roughly 41% of list (50% ÷ 1.2):Recharge Promotions
See the latest recharge bonus tiers — larger top-ups earn bigger bonuses
Summary & Recommendations
MiniMax-M3 is the first model to pack “open weights + million-token context + multimodal agent” into a single package:- ✅ Benchmark wins: SWE-Bench Pro 59.0 above GPT-5.5 / Gemini 3.1 Pro; BrowseComp 83.5 above Opus 4.7
- ✅ Million-token context, actually affordable: MSA sparse attention cuts 1M-context cost to 1/20 of the previous generation
- ✅ Pricing window: limited-time 50% off + recharge bonuses ≈ 41% of list, ending June 8, 00:00 (UTC+8)
- ✅ Open weights incoming: weights and technical report within 10 days — start with the API now, run offline evals later
- Add
MiniMax-M3to your A/B rotation in Claude Code / Cursor / in-house agents, focusing on whole-repo and long-horizon tasks - Run a full evaluation of long-document / video-understanding workloads during the 50%-off window
- Stack recharge bonuses to push the effective rate to ~41%, and schedule batch jobs before June 8, 00:00 (UTC+8)
Sources & dates
- Official MiniMax release:
minimax.io/blog/minimax-m3 - Third-party coverage:
venturebeat.com,techtimes.com,officechai.com - Pricing page:
openrouter.ai/minimax/minimax-m3 - Data retrieved: 2026-06-05