Overview
Wan (Tongyi Wanxiang) is Alibaba Cloud’s video generation model series. APIYI connects directly to Alibaba Cloud Model Studio through a DashScope passthrough channel, so a single APIYI Key (starting with sk-) unlocks all Wan video capabilities, with no separate Alibaba Cloud account required. The current flagship is Wan2.7, covering four core use cases:
| Use case | Model ID | Your input | Output |
|---|---|---|---|
| Text-to-video | wan2.7-t2v | A text prompt | 2-15 second short video |
| Image-to-video | wan2.7-i2v | First frame + prompt (optional driving audio) | Bring a static image to life; add audio for lip-sync / rap |
| Reference-to-video | wan2.7-r2v | 1-5 reference images/videos + prompt | Single- or multi-character video that preserves reference subjects, with voice reference |
| Video edit | wan2.7-videoedit | A video + 1-5 reference images + edit instruction | Edited video: outfit swap, background swap, etc. |
The capability is selected through the model field. Native support for 720P / 1080P resolutions and 2-15 second integer durations; wan2.7-i2v also supports driving audio for lip-sync. Ideal for short-video production, e-commerce assets, digital-human narration, and creative marketing.
Text-to-Video API
wan2.7-t2v generates video from a pure text prompt, the simplest entry point.
Image-to-Video API
wan2.7-i2v takes a first frame + optional driving audio for lip-sync / rap.
Reference-to-Video API
wan2.7-r2v preserves subject features from reference images/videos, with voice reference.
Video Edit API
wan2.7-videoedit edits a video with reference images: outfit swap, background swap, etc.
Why use Wan on APIYI
One Key for every capability
Direct access, no VPN
The service is reached at api.apiyi.com, from mainland data centers and home networks alike, with no need to configure an Alibaba Cloud regional endpoint.
No charge on failure
Tasks that end as failed (unreachable media URL, sensitive prompt, upstream capacity, etc.) are not billed, so a failed attempt costs nothing.
DashScope protocol passthrough
Core features
Four-in-one async endpoint
Everything goes through POST /wan/api/v1/...video-synthesis. Submit, get a task_id, poll, and download. Easy batch management.
Audio-driven lip-sync
wan2.7-i2v supports driving_audio, so a static portrait’s mouth movements follow the speech and rhythm of the audio. Great for rap / narration / digital humans.
Multi-subject reference
wan2.7-r2v mixes reference images + reference videos (5 total max), referenced in the prompt as “image 1 / video 1”, with voice reference support.
Multiple resolutions and durations
prompt_extend smart rewriting further improves quality for short prompts.
Supported models
| Model ID | Capability | Required media input | Notes |
|---|---|---|---|
| wan2.7-t2v | Text-to-video | None | Pure text generation |
| wan2.7-i2v | Image-to-video | first_frame (+ optional driving_audio) | The only capability supporting audio drive |
| wan2.7-r2v | Reference-to-video | reference_image / reference_video (5 total max) | Supports reference_voice voice reference |
| wan2.7-videoedit | Video edit | video + reference_image (1-5) | The model name is written videoedit, with no hyphen |
⚠️ Endpoint choice (most important)
APIYI mounts two paths, but only the DashScope passthrough endpoint fully supports every Wan capability:
| Path | Protocol style | i2v / r2v availability | Verdict |
|---|---|---|---|
| /v1/videos | OpenAI flat style | ❌ Media fields are dropped | Do not use |
| /wan/api/v1/services/aigc/video-generation/video-synthesis | DashScope native passthrough | ✅ Fully supported | Always use this |
Async call flow
The whole flow is three async steps: create task → poll status → download video.
Create the task
POST /wan/api/v1/services/aigc/video-generation/video-synthesis with the header X-DashScope-Async: enable. It returns a task_id immediately.
Poll the status
GET /v1/tasks/{task_id} (with Authorization) once every 5-10 seconds, and never more often than every 3 seconds, until status becomes completed or failed.
Download the video
Fetch result_url from the completed task, without the Authorization header.
Task status reference
The top-level status field of the GET /v1/tasks/{task_id} response (already normalized by APIYI):
| Status | Meaning | Next step |
|---|---|---|
| submitted | Submitted, queued | Keep polling |
| in_progress | Generating | Keep polling (progress often stalls at 30%; that is the upstream’s coarse reporting, not a hang) |
| completed | Success | Download from result_url |
| failed | Failed | Check error.message / fail_reason |
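The status table can be read as a small decision function. The sketch below assumes the task response is a JSON object with the top-level status, result_url, error.message and fail_reason fields named on this page; next_step is a hypothetical helper name, not part of any SDK.

```python
from typing import Optional, Tuple


def next_step(task: dict) -> Tuple[str, Optional[str]]:
    """Map a GET /v1/tasks/{task_id} response to (action, detail).

    action is one of "poll", "download" or "fail".
    """
    status = task.get("status")
    if status in ("submitted", "in_progress"):
        # Queued or generating: keep polling, even if progress sits at 30%.
        return "poll", None
    if status == "completed":
        return "download", task.get("result_url")
    if status == "failed":
        error = task.get("error")
        # Assumption: error is an object with a message; fall back to fail_reason.
        message = error.get("message") if isinstance(error, dict) else error
        return "fail", message or task.get("fail_reason")
    # Anything else (for example an expired task) is treated as a failure.
    return "fail", f"unexpected status: {status!r}"
```

Treating unknown statuses as failures is a design choice of this sketch; it keeps a polling loop from spinning forever on a value the table does not list.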
Full Python client
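A minimal client sketch for the three steps, using only the standard library. Several details are assumptions rather than facts from this page: the https://api.apiyi.com base URL (only the host name appears here), the Bearer form of the Authorization header, and the location of task_id in the create response (top level, or nested under output as in DashScope's native format). WanClient is a hypothetical class name.

```python
import json
import shutil
import time
import urllib.request

BASE_URL = "https://api.apiyi.com"  # assumption: https on the documented host
CREATE_PATH = "/wan/api/v1/services/aigc/video-generation/video-synthesis"


class WanClient:
    """Hypothetical helper, not an official APIYI SDK."""

    def __init__(self, api_key, base_url=BASE_URL, urlopen=urllib.request.urlopen):
        self.api_key = api_key
        self.base_url = base_url.rstrip("/")
        self._urlopen = urlopen  # injectable so the flow can be exercised offline

    def _json(self, method, path, body=None, headers=None):
        # Assumption: the Key is sent as a Bearer token.
        all_headers = {"Authorization": f"Bearer {self.api_key}"}
        all_headers.update(headers or {})
        data = None
        if body is not None:
            data = json.dumps(body).encode("utf-8")
            all_headers["Content-Type"] = "application/json"
        request = urllib.request.Request(
            self.base_url + path, data=data, headers=all_headers, method=method
        )
        with self._urlopen(request, timeout=60) as response:
            return json.loads(response.read().decode("utf-8"))

    def create(self, model, prompt, media=None, **parameters):
        """Step 1: submit the nested body and return the task_id."""
        body = {"model": model, "input": {"prompt": prompt}, "parameters": parameters}
        if media:
            body["input"]["media"] = list(media)
        reply = self._json("POST", CREATE_PATH, body, {"X-DashScope-Async": "enable"})
        # Assumption: task_id is either top-level or nested under "output".
        task_id = reply.get("task_id") or (reply.get("output") or {}).get("task_id")
        if not task_id:
            raise RuntimeError(f"no task_id in response: {reply}")
        return task_id

    def get(self, task_id):
        """Step 2 (single query): fetch the normalized task record."""
        return self._json("GET", f"/v1/tasks/{task_id}")

    def wait(self, task_id, interval=5.0, timeout=20 * 60, sleep=time.sleep):
        """Step 2 (loop): poll until completed, failed, or the client timeout."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            task = self.get(task_id)
            status = task.get("status")
            if status == "completed":
                return task["result_url"]
            if status == "failed":
                error = task.get("error")
                message = error.get("message") if isinstance(error, dict) else error
                raise RuntimeError(message or task.get("fail_reason") or "task failed")
            sleep(interval)
        raise TimeoutError(f"task {task_id} not finished after {timeout} s")

    def download(self, result_url, path):
        """Step 3: fetch the signed link without any Authorization header."""
        request = urllib.request.Request(result_url)
        with self._urlopen(request, timeout=300) as response, open(path, "wb") as out:
            shutil.copyfileobj(response, out)
```

A typical call sequence would be create("wan2.7-t2v", prompt, resolution="720P", duration=5), then wait(task_id), then download(result_url, "out.mp4"). The urlopen argument exists only so the flow can be tested with a stub instead of the network.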
Key parameters explained
When submitting, the body uses DashScope’s nested structure: { model, input: { prompt, media[] }, parameters: {...} }.
input fields
| Field | Type | Required | Notes |
|---|---|---|---|
| prompt | string | ✓ | Natural-language description; wan2.7-r2v supports “image 1 / video 1” markers to reference media |
| negative_prompt | string | | Negative prompt, ≤500 characters |
| media | array | Required for i2v/r2v/edit | Media asset array, see below |
media[] types
| type | Purpose | Applicable models |
|---|---|---|
| first_frame | First frame image (≤1) | i2v, r2v |
| reference_image | Reference image (preserve subject/scene) | r2v, videoedit |
| reference_video | Reference video (subject/voice reference) | r2v |
| driving_audio | Driving audio (lip-sync) | i2v only |
| video | Input video | videoedit |
| reference_voice | Voice reference (attached to reference_image/video) | r2v |
Each media item is an object with type + url. The url must be a public https link that can be fetched directly with GET (upload local files to OSS / CDN first).
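The nested body and the media table can be combined into a small builder. This is a sketch under assumptions: media_item and build_body are hypothetical helper names, the per-model allow-list simply mirrors the table above, and how reference_voice is attached to a reference item is not modelled.

```python
# Media types each model accepts, copied from the media[] table.
ALLOWED_MEDIA = {
    "wan2.7-t2v": set(),
    "wan2.7-i2v": {"first_frame", "driving_audio"},
    "wan2.7-r2v": {"first_frame", "reference_image", "reference_video", "reference_voice"},
    "wan2.7-videoedit": {"video", "reference_image"},
}
REFERENCE_TYPES = {"reference_image", "reference_video"}


def media_item(kind, url):
    """Build one media[] entry; the url must be a public https link."""
    if not url.startswith("https://"):
        raise ValueError("media url must be a public https link")
    return {"type": kind, "url": url}


def build_body(model, prompt, media=(), **parameters):
    """Assemble { model, input: { prompt, media[] }, parameters: {...} }."""
    media = list(media)
    allowed = ALLOWED_MEDIA[model]
    kinds = [item["type"] for item in media]
    for kind in kinds:
        if kind not in allowed:
            raise ValueError(f"{kind} is not accepted by {model}")
    if allowed and not media:
        raise ValueError(f"{model} requires at least one media item")
    if kinds.count("first_frame") > 1:
        raise ValueError("at most one first_frame is allowed")
    if sum(kind in REFERENCE_TYPES for kind in kinds) > 5:
        raise ValueError("at most 5 reference images/videos are allowed")
    body = {"model": model, "input": {"prompt": prompt}, "parameters": parameters}
    if media:
        body["input"]["media"] = media
    return body
```

For example, an audio-driven request would pass a first_frame item and a driving_audio item to build_body("wan2.7-i2v", ...); checking the types locally catches a wrong combination before a task is submitted.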
parameters fields
| Field | Type | Values | Notes |
|---|---|---|---|
| resolution | string | 720P / 1080P | Uppercase; specifying it explicitly is recommended |
| ratio | string | 16:9 / 9:16 / 1:1 / 4:3 / 3:4 | Aspect ratio; ignored automatically when a first frame is supplied |
| duration | int | 2-15 | Seconds (integer), commonly 5 / 10; capped at 10 when a reference video is included |
| prompt_extend | bool | true / false | Smart prompt rewriting; true is strongly recommended |
| watermark | bool | true / false | “AI generated” watermark in the bottom-right corner |
| seed | int | 0-2147483647 | Fixing it improves reproducibility |
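Because a creation-stage rejection costs a round trip, it can help to check parameters locally first. The sketch below encodes only the ranges from the table above; check_parameters is a hypothetical helper, and the server remains the authority on what is accepted.

```python
RESOLUTIONS = {"720P", "1080P"}
RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
MAX_SEED = 2147483647


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def check_parameters(params, has_reference_video=False):
    """Return a list of problems found in a parameters dict (empty if none)."""
    problems = []
    resolution = params.get("resolution")
    if resolution is not None and resolution not in RESOLUTIONS:
        problems.append("resolution must be 720P or 1080P (uppercase)")
    ratio = params.get("ratio")
    if ratio is not None and ratio not in RATIOS:
        problems.append("ratio must be one of " + ", ".join(sorted(RATIOS)))
    duration = params.get("duration")
    if duration is not None:
        # The cap drops from 15 to 10 seconds when a reference video is included.
        limit = 10 if has_reference_video else 15
        if not _is_int(duration):
            problems.append("duration must be an integer number of seconds")
        elif not 2 <= duration <= limit:
            problems.append(f"duration must be between 2 and {limit}")
    seed = params.get("seed")
    if seed is not None and not (_is_int(seed) and 0 <= seed <= MAX_SEED):
        problems.append(f"seed must be an integer in 0-{MAX_SEED}")
    for flag in ("prompt_extend", "watermark"):
        if flag in params and not isinstance(params[flag], bool):
            problems.append(f"{flag} must be true or false")
    return problems
```

Returning a list instead of raising on the first problem lets a caller report every issue in one pass.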
Choosing between Wan and HappyHorse
Wan and HappyHorse are both Alibaba video models and share the same endpoint and schema (swap them by changing only the model name), but their strengths differ:
| Dimension | Wan2.7 | HappyHorse-1.0 |
|---|---|---|
| Audio-driven lip-sync (i2v) | ✅ wan2.7-i2v supports driving_audio | ❌ Not supported, i2v takes a first frame only |
| Reference-to-video image cap | reference image + reference video, 5 total max | up to 9 reference images |
| Video-edit reference images | ≤5 | ≤5 |
| Subject-consistency style | Multi-subject interaction, voice reference | Leans toward “faithful reproduction of dynamic footage”, keeps subjects stable |
Best practices
Iterate first at 720P / 5 seconds
Always enable prompt_extend
prompt_extend: true clearly improves quality for short prompts, at the cost of only a few extra seconds of generation time.
Poll every 5-10 seconds
Set a 20-minute client timeout as a backstop
Download as soon as you get result_url
result_url expires in 24 hours by default and is an OSS signed direct link, so do not send the Authorization header when downloading. In production, always copy the file to your own OSS / CDN.
Error codes and retries
Errors come from two stages and are handled differently:
| Source | Signature | Handling |
|---|---|---|
| Creation stage (rejected by APIYI) | HTTP 4xx/5xx, type is task_error / parse_request_failed / build_request_failed | Fix the body and retry (usually a wrong field type, missing media, or wrong endpoint) |
| Execution stage (rejected by upstream Alibaba Cloud) | Task ends as status=failed, error.message prefixed with [InvalidParameter] / [InvalidImageUrl] etc. in brackets | Read the bracketed hint; usually an unreachable media URL or a sensitive prompt |
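For execution-stage failures, the bracketed prefix of error.message can drive the retry decision. The sketch below assumes the message starts with a code in square brackets, as described in the table, and treats only [InvalidImageUrl] as retryable, following the retry guidance in this section; upstream_code and should_retry are hypothetical helper names.

```python
import re

# Matches a leading "[SomeCode]" prefix such as "[InvalidImageUrl] ...".
_CODE_PREFIX = re.compile(r"^\s*\[([A-Za-z0-9_.]+)\]")

# Assumption: only this code signals a possibly transient problem.
RETRYABLE_CODES = {"InvalidImageUrl"}


def upstream_code(message):
    """Extract the bracketed upstream code from an error message, if present."""
    match = _CODE_PREFIX.match(message or "")
    return match.group(1) if match else None


def should_retry(task):
    """Decide whether a task record describes a failure worth resubmitting."""
    if task.get("status") != "failed":
        return False
    error = task.get("error")
    message = error.get("message") if isinstance(error, dict) else error
    code = upstream_code(message or task.get("fail_reason"))
    return code in RETRYABLE_CODES
```

Failures without a recognizable code are not retried here, which errs on the side of surfacing the error to the caller.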
A failed task with [InvalidImageUrl] can be retried (possibly a transient network issue), while [InvalidParameter] or sensitive-word failures should not be retried unchanged.
FAQ
Why can't I use /v1/videos to submit Wan tasks?
/v1/videos is an OpenAI flat-style endpoint with incomplete support for Wan’s i2v / r2v: the media field is dropped, so upstream Alibaba Cloud returns [InvalidParameter] Field required: input.media. Send every Wan video creation request to /wan/api/v1/services/aigc/video-generation/video-synthesis, and every query to /v1/tasks/{task_id}.
What does the X-DashScope-Async: enable header do? Is it required?
It marks the create call (POST) as asynchronous and is required there. Omitting it appears to produce the error: current user api does not support synchronous calls. The query call (GET) does not need this header.
Why query at /v1/tasks/{id} instead of /wan/api/v1/tasks/{id}?
Task queries are unified at /v1/tasks/{task_id}. No matter which path you used to create the task, you query it through this one endpoint, and the response’s top-level status / progress / result_url / error fields are consistent.
result_url download returns 403 / SignatureDoesNotMatch, what now?
Remove the Authorization header. result_url is already an Alibaba Cloud OSS pre-signed direct link; adding an APIYI Key makes OSS reject it.
What if result_url has expired?
Query /v1/tasks/{task_id} again and you usually get a fresh result_url, but the task_id itself can only be queried for 24 hours (after that the query returns UNKNOWN). For long-term storage, download to your own storage as soon as possible.
progress is stuck at 30%, is it hung?
As long as status is still in_progress, keep waiting; progress usually jumps straight from 30% to 100%.
How many tasks can one Key run concurrently?
Are failed tasks billed?
A task that ends with status=failed is not billed. Note that each successful submission is billed separately, so resubmitting the same task bills again; make submission idempotent. During testing you can turn off prompt_extend and use 720P / 5 seconds / short prompts to lower the unit cost.
Is wan2.6 still usable?
Wan2.6 (for example wan2.6-r2v-flash) is still on the callable list, with the same protocol as Wan2.7; just change the model name. See Historical Versions.
Group Setup
The Wan and HappyHorse series share a single Wan group, so one Token can call both series (for example, a Token named Wan2.7&HappyHorse). Video models are billed per second, so the Token must meet two conditions to route successfully:
- Billing model: choose Pay-as-you-go Priority or Pay-as-you-go. Video is billed per second, so Pay-per-request Tokens cannot route.
- Group: select a group that includes Wan.
Pricing
Default price = 98% of Alibaba’s official price (simple to reason about)
In the console the Wan group shows a rate of 0.14x, which is denominated in the built-in RMB pricing unit. Because APIYI bills in USD at a fixed 1:7 exchange rate, the effective conversion is:
Conversion: USD price per second = official RMB price × 0.14 (i.e. × 0.98 ÷ 7). For example, the official 1080P price of ¥1.0/s → $0.14/s, exactly the 0.14x shown in the console.
Price detail (default price, billed per second)
Wan2.7 text-to-video / image-to-video / reference-to-video are priced the same, with two tiers, 720P and 1080P (480P is not supported):
| Resolution | Official price | Our default /s | 5 s | 10 s | 12 s |
|---|---|---|---|---|---|
| 720P | ¥0.6/s | $0.084/s | $0.42 | $0.84 | $1.01 |
| 1080P | ¥1.0/s | $0.14/s | $0.70 | $1.40 | $1.68 |
- wan2.7-r2v defaults to 1080P, and duration is capped at 10 seconds when reference media includes a video.
- wan2.7-videoedit (video edit) output duration follows the source video and is billed by actual output seconds, not by duration.
- Prices shown are the default (98% of official); with the maximum top-up bonus, the effective price is roughly the table value ÷ 1.2 (e.g. 1080P 5 s $0.70 → about $0.58).
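The price table can be reproduced from the conversion rule. The sketch below uses the official RMB prices and the 0.14 rate quoted on this page; task_cost_usd is a hypothetical helper, rounding to cents with half-up rounding is an assumption about presentation, and the rate shown in the console should be treated as authoritative.

```python
from decimal import Decimal, ROUND_HALF_UP

# Official RMB prices per second, as quoted in the table above.
OFFICIAL_RMB_PER_SECOND = {"720P": Decimal("0.6"), "1080P": Decimal("1.0")}
RATE = Decimal("0.14")  # 0.98 (discount) / 7 (fixed exchange rate)


def usd_per_second(resolution):
    """Default USD price per second for a resolution tier."""
    return OFFICIAL_RMB_PER_SECOND[resolution] * RATE


def task_cost_usd(resolution, seconds, bonus=Decimal("1")):
    """Cost of one successful task; bonus=Decimal("1.2") models the top tier."""
    cost = usd_per_second(resolution) * seconds / bonus
    # Assumption: display rounding to cents, half-up.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(task_cost_usd("720P", 12))                       # 1.01
print(task_cost_usd("1080P", 5))                       # 0.70
print(task_cost_usd("1080P", 5, bonus=Decimal("1.2")))  # 0.58
```

Decimal is used instead of float so that values such as 0.084 × 12 round the same way as the table.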
Stack top-up bonuses for an even lower effective price
After joining the top-up bonus program, credited balance can be boosted up to ~1.2x, pushing the effective price lower still:
| Tier | Effective price (vs Alibaba official) | Formula |
|---|---|---|
| Default | 98% | rate 0.14x × fixed exchange rate 7 |
| With top-up bonuses (max tier for large customers) | ~81.6% | 0.98 ÷ 1.2 |
- Billing dimension = resolution tier × duration (seconds); failed tasks are not billed.
- 1:7 is a fixed settlement exchange rate (not a preferential rate); it applies uniformly to all USD top-ups.
- For the highest bonus tiers and eligible channels, see top-up bonuses. The rate shown in the console is authoritative.
Related docs
Text-to-Video Playground
wan2.7-t2v live debugging + code samples
Image-to-Video Playground
wan2.7-i2v first frame + driving audio
Reference-to-Video Playground
wan2.7-r2v multi-subject reference + voice
Video Edit Playground
wan2.7-videoedit outfit / background swap
Historical Versions (Wan2.6)
HappyHorse Series
Upstream reference: help.aliyun.com/zh/model-studio/text-to-video-api-reference. For questions or suggestions, please open a ticket in the APIYI console.