
Overview

doubao-seedance-2-0-260128 (standard) and doubao-seedance-2-0-fast-260128 (fast) are ByteDance’s latest video generation models, served through APIYI on official Volcengine Mainland China resources (not the BytePlus international edition) with upstream content-safety built in. They support text-to-video, first/first+last frame image-to-video, 1-9 reference images, and reference video/audio inputs — and can generate voice, sound effects, and background music synchronized with the visuals.
🎬 Highlights: 4-15 s controllable duration (or -1 for model-chosen length), three resolution tiers (480p/720p/1080p), 6 aspect ratios plus adaptive, synchronized audio on by default, and multilingual prompts (Chinese, English, Japanese, Spanish, Portuguese, Indonesian). Built for short-video production, e-commerce assets, motion design, and virtual-human content at scale.

Video Generation API Reference

POST /seedance/api/v3/contents/generations/tasks — async task endpoint with an interactive Playground and full polling/download code.

API Manual

Token creation, base URL, billing models, and general calling conventions.

Why APIYI’s Seedance 2.0?

Benchmarked against the Volcengine official channel, optimized for production workloads across stability, cost, and onboarding:

Official Resource · Mainland Edition

Official Volcengine Mainland China resources (not BytePlus international), with upstream content safety built in. Parameters, responses, and billing match the official API exactly.

Unlimited Concurrency · No Queuing

In our tests (measured 2026-06-06, UTC+8), 15 simultaneous tasks all entered the running state immediately with zero queuing, so batch production at scale is practical.

Official Pricing + Top-up Bonuses

Unit prices aligned with Volcengine’s official list (on-platform billing runs roughly 10% higher); combined with top-up bonuses the effective cost is about on par with the official channel — without the CNY 200 activation deposit or enterprise verification.

Zero-friction Global Access

No Volcengine account or identity verification required. Mainland China data centers, residential networks, and overseas nodes can all reach api.apiyi.com directly with a single Token.

Full Video Model Lineup

VEO 3.1, Sora 2, and Wan2.7 are available on the same platform — mix and match per use case.

Professional Support

A team experienced in video-generation workloads, providing model selection, tuning, and integration support from PoC to production.

Key Features

Three Tiers · Same Price per Tier

480p / 720p / 1080p (1080p standard model only). Within a tier, 16:9, 9:16, 1:1, and every other ratio share the same pixel area and the same price — switch between landscape and portrait at zero cost.

Synchronized Audio by Default

generate_audio defaults to true: voice, sound effects, and background music are generated to match the visuals. Put spoken lines in double quotes to improve voice-over quality.

4-15 s Controllable Duration

duration accepts whole seconds from 4 to 15, or -1 to let the model pick a length (billed by actual output). Fixed 24 fps.

Multilingual Prompts

Chinese (up to ~500 chars) and English (up to ~1000 words), plus Japanese, Spanish, Portuguese, and Indonesian.

First / First+Last Frame

Animate a single image as the first frame, or pin both first and last frames with two images. Combine with return_last_frame to chain clips into longer continuous videos.

Multi-modal Reference

Mix 1-9 reference images with optional reference video and audio (the three image modes are mutually exclusive) to keep characters and style consistent.

Async Task Flow

Submit and get a task_id, poll for status, then download the mp4 from content.video_url (link valid for 24 hours).

Reproducible Seeds

Fix seed for similar results across runs. watermark defaults to false — output is watermark-free.

Pricing

Token-based billing: tokens ≈ duration(s) × width × height × 24 / 1024 (verified in our tests to within 0.1%). Since every ratio in a tier has the same pixel area, price depends only on the resolution tier and duration. Official public pricing (16:9 / 5 s / no input video):
Resolution | Fast | Standard
480p | CNY 1.86 | CNY 2.31
720p | CNY 4.00 | CNY 4.97
1080p | Not supported | CNY 12.39
Measured on-platform reference (tested 2026-06-06 (UTC+8), per 5-second video with default audio, for estimation only):
Resolution | Per 5 s | Per second | Per second (¥) | Best price | Best price (¥)
480p | ≈$0.37 | $0.073/s | ¥0.51/s | $0.061/s | ¥0.43/s
720p | ≈$0.79 | $0.157/s | ¥1.10/s | $0.131/s | ¥0.92/s
1080p | ≈$1.77 | $0.354/s | ¥2.48/s | $0.295/s | ¥2.07/s
Note: “Best price” is the per-second price ÷ 1.2, i.e. the effective rate with the up-to-20% top-up bonus; CNY converted at the fixed 1:7 rate.
Billing notes:
  • Final charges follow the console’s model pricing and call logs. Due to tax, FX, and upstream settlement, on-platform billing may run roughly 10% above the official list; with top-up bonuses the effective cost is about on par with the official channel
  • Tasks are pre-charged on submission and settled on completion — your balance fluctuates briefly; reconcile against call logs
  • Rejected requests (HTTP 400 parameter errors, etc.) are not billed (verified)
  • Cost scales linearly with duration: a 15 s video costs about 3× a 5 s one
  • Fast and standard bill at the same rate on APIYI — choose fast for speed, standard for 1080p
Beta-supply notice: Seedance 2.0 is currently in a beta supply phase. If your actual charges deviate noticeably from the table above, contact customer support and we will reconcile. Pricing will be adjusted dynamically with upstream policy (e.g. if a lower-priced official variant ships later) and APIYI’s supply capacity; capable channel partners are welcome to reach out. This model is priced to secure supply and serve customers, not for profit.
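
The token formula in this section can be turned into a small estimator for budgeting. This is a minimal sketch that only restates the approximation given above; the function name estimate_tokens is hypothetical, and the authoritative figure is always usage.completion_tokens in the task response.

```python
def estimate_tokens(duration_s: int, width: int, height: int, fps: int = 24) -> float:
    """Approximate billed tokens: duration x width x height x fps / 1024."""
    return duration_s * width * height * fps / 1024


# 720p 16:9 output is 1280x720; compare a 5 s clip with a 15 s clip.
five_s = estimate_tokens(5, 1280, 720)
fifteen_s = estimate_tokens(15, 1280, 720)

print(five_s)               # 108000.0
print(fifteen_s / five_s)   # 3.0, cost scales linearly with duration
```

Because this is an approximation, measured usage can differ slightly from the estimate; reconcile against call logs rather than against this calculation.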

Group Setup

Seedance 2.0 runs on the dedicated SeeDance2 group (0.18x rate, CNY-denominated), with two hard requirements: ① the Token’s billing model must be Pay-as-you-go Priority (or Pay-as-you-go) — Pay-per-request tokens cannot route; ② the Token must have the SeeDance2 group enabled. Tokens on the Default group or other video groups will fail with “no available channel for this model”.
Group | Rate | When to use
SeeDance2 | 0.18x | The only group serving Seedance 2.0; ample concurrency, no queuing
Two recommended Token configurations:
Setup | Best for | How
A. One shared Token | Personal projects, mixed model usage | Add SeeDance2 to your existing Token’s group list; keep billing model as Pay-as-you-go Priority
B. Dedicated Token | Production workloads, separate billing | Create a Token with only the SeeDance2 group for cleaner reporting and quota alerts per business line
For production we recommend B (dedicated Token): clean billing, per-line quota control, and easier troubleshooting when usage spikes.

Technical Specs

Dimension | Value
Models | doubao-seedance-2-0-260128 (standard) / doubao-seedance-2-0-fast-260128 (fast)
Resolutions | 480p / 720p / 1080p (no 1080p on fast)
Aspect ratios | 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive (default adaptive)
Duration | 4-15 whole seconds, or -1 model-chosen (default 5)
Frame rate | Fixed 24 fps (frames parameter not supported)
Audio | generate_audio defaults to true; mono
Input images | jpeg/png/webp/bmp/tiff/gif/heic/heif; aspect ratio (0.4, 2.5); sides (300, 6000) px; under 30 MB each
Input video/audio | Seedance 2.0 only; audio wav/mp3, 2-15 s per clip, up to 3 clips, must accompany an image or video
Generation time (measured) | 5 s @720p: ~2-5 min; 1080p: ~3 min; 15 s: ~4.5 min
Response fields | content.video_url (mp4 direct link, expires in 24 h), usage.completion_tokens
Task retention | task_id queryable for 7 days
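
The input-image limits in the table can be checked locally before uploading. The sketch below is one possible pre-flight check, under stated assumptions: the function name check_input_image is hypothetical, aspect ratio is read as width / height, the bounds are treated as open intervals as written, and "30 MB" is read as 30 MiB.

```python
ALLOWED_FORMATS = {"jpeg", "png", "webp", "bmp", "tiff", "gif", "heic", "heif"}
MAX_BYTES = 30 * 1024 * 1024  # assumption: "under 30 MB" read as 30 MiB


def check_input_image(fmt: str, width: int, height: int, size_bytes: int) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    name = fmt.lower().lstrip(".")
    if name == "jpg":  # assumption: treat the common alias as jpeg
        name = "jpeg"
    if name not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    for label, side in (("width", width), ("height", height)):
        if not 300 < side < 6000:
            problems.append(f"{label} {side} px is outside (300, 6000)")
    if height > 0 and not 0.4 < width / height < 2.5:
        problems.append(f"aspect ratio {width / height:.2f} is outside (0.4, 2.5)")
    if size_bytes >= MAX_BYTES:
        problems.append("file is not under 30 MB")
    return problems


print(check_input_image("png", 1280, 720, 2_000_000))   # []
print(check_input_image("jpg", 3000, 1000, 2_000_000))  # one aspect-ratio problem
```

The server remains the source of truth; a local check like this only saves a round trip for obviously invalid inputs.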

API Endpoints

Endpoint | Purpose | Content-Type
POST /seedance/api/v3/contents/generations/tasks | Create a video generation task | application/json
GET /seedance/api/v3/contents/generations/tasks/{id} | Poll task status / fetch the video URL | -
Domains: api.apiyi.com is the primary gateway; vip.apiyi.com and other platform domains behave identically. The path prefix is /seedance/api/v3. Do not drop the /api segment, and do not use /v1/videos.
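
To make the two endpoints concrete, here is a minimal sketch that builds (but does not send) the create and poll requests with Python's standard library. Several details are assumptions rather than facts from this page: the https scheme, the helper names, and the request body shape (a model field plus a content array with a text item). The API Reference is the authority for the actual schema.

```python
import json
import urllib.request

BASE_URL = "https://api.apiyi.com"  # assumption: https on the primary gateway
TASKS_PATH = "/seedance/api/v3/contents/generations/tasks"


def _headers(token: str) -> dict:
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        # Avoids the mislabeled content-encoding: gzip issue in some clients.
        "Accept-Encoding": "identity",
    }


def build_create_request(token: str, prompt: str,
                         model: str = "doubao-seedance-2-0-fast-260128",
                         **options) -> urllib.request.Request:
    """Build the POST request; the body shape is an assumed sketch."""
    payload = {"model": model,
               "content": [{"type": "text", "text": prompt}],
               **options}
    return urllib.request.Request(
        BASE_URL + TASKS_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers=_headers(token),
        method="POST",
    )


def build_poll_request(token: str, task_id: str) -> urllib.request.Request:
    """Build the GET request for an existing task id."""
    return urllib.request.Request(
        f"{BASE_URL}{TASKS_PATH}/{task_id}",
        headers=_headers(token),
        method="GET",
    )


req = build_create_request("YOUR_TOKEN", "A paper boat drifting down a rainy street",
                           duration=5, ratio="16:9", generate_audio=True)
print(req.get_method(), req.full_url)
# To send: urllib.request.urlopen(req, timeout=60)
```

Building the request separately from sending it keeps the URL and header logic easy to test without network access.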

Resolutions & Aspect Ratios in Detail

A resolution tier defines the pixel area, not the short side. Actual output dimensions per ratio (official values, verified in our tests):
Ratio | 480p | 720p | 1080p (standard only)
16:9 | 864×496 | 1280×720 | 1920×1080
4:3 | 752×560 | 1112×834 | 1664×1248
1:1 | 640×640 | 960×960 | 1440×1440
3:4 | 560×752 | 834×1112 | 1248×1664
9:16 | 496×864 | 720×1280 | 1080×1920
21:9 | 992×432 | 1470×630 | 2206×946
adaptive | Model picks one of the above based on the input | Same | Same

How adaptive works

  1. Text-to-video: the model infers the best ratio from your prompt
  2. First / first+last frame: matches the first-frame image’s ratio (mismatched images are center-cropped)
  3. Multi-modal reference: follows prompt intent, otherwise the first media item (video takes priority over images)
  4. The actual ratio used is returned in the task response’s ratio field
ratio only accepts the 7 enum values above — passing e.g. "2:1" returns an InvalidParameter error (verified), as does a duration outside 4-15. Neither is billed.
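
Since invalid parameters are rejected with HTTP 400, it can be convenient to catch them before submitting. The following is a minimal sketch of a client-side check built only from the rules on this page. The function name validate_params is hypothetical, and the parameter name resolution is an assumption (this page names ratio and duration explicitly, but not the resolution field).

```python
RATIOS = {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"}
RESOLUTIONS = {"480p", "720p", "1080p"}
FAST_MODEL = "doubao-seedance-2-0-fast-260128"


def validate_params(model: str, resolution: str, ratio: str, duration: int) -> list:
    """Return a list of problems; an empty list means the combination looks valid."""
    errors = []
    if resolution not in RESOLUTIONS:
        errors.append(f"resolution must be one of {sorted(RESOLUTIONS)}")
    elif resolution == "1080p" and model == FAST_MODEL:
        errors.append("1080p is only available on the standard model")
    if ratio not in RATIOS:
        errors.append(f"ratio {ratio!r} is not one of the 7 accepted values")
    is_int = isinstance(duration, int) and not isinstance(duration, bool)
    if not is_int or not (duration == -1 or 4 <= duration <= 15):
        errors.append("duration must be a whole number from 4 to 15, or -1")
    return errors


print(validate_params("doubao-seedance-2-0-260128", "1080p", "16:9", 5))  # []
print(validate_params(FAST_MODEL, "1080p", "2:1", 3))  # three problems
```

A local check does not replace the server's validation; it simply avoids a round trip for the common mistakes listed above.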

Best Practices

1. Pick the model by output needs

Choose the standard model doubao-seedance-2-0-260128 for 1080p or maximum quality; choose fast for batch production (same price on APIYI — its advantage is speed).
2. Use adaptive to avoid cropping

For image-to-video keep the default adaptive so the model matches your source image’s ratio. Lock 9:16 (portrait) or 16:9 (landscape) only when the target platform demands it.
3. Duration is your cost dial

Cost scales linearly with length. Validate prompts with 5 s clips first, then scale to 10-15 s; use duration: -1 when pacing is best left to the model.
4. Turn audio off when you don't need it

generate_audio defaults to true. Pass false for silent footage you plan to score yourself.
5. Quote dialogue for better voice-over

Put spoken lines inside double quotes in the prompt — the model generates matching voices automatically.
6. Add Accept-Encoding: identity in HTTP clients

The gateway labels responses content-encoding: gzip while the body is uncompressed; auto-decompressing clients such as Python requests raise ContentDecodingError. Adding the Accept-Encoding: identity header avoids this (curl is unaffected).
7. Poll every 15-30 s and download immediately

Tasks typically finish in 2-5 minutes. content.video_url is a signed link valid for 24 hours — copy the file to your own storage as soon as the task succeeds.
8. Chain clips with return_last_frame

Set return_last_frame: true to get a watermark-free last-frame png, then use it as the next task’s first frame to build continuous multi-clip videos.
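
The clip-chaining practice above can be sketched as a simple loop. This is an illustration under loud assumptions, not the platform's documented schema: run_clip is a hypothetical callable that submits a payload, waits for the task, and returns a dict with video_url and last_frame_url keys; the image item shape (type image_url with a role field) is likewise assumed.

```python
from typing import Callable, List


def chain_clips(prompts: List[str],
                run_clip: Callable[[dict], dict],
                model: str = "doubao-seedance-2-0-260128") -> List[str]:
    """Generate clips in order, feeding each last frame into the next clip."""
    video_urls = []
    first_frame_url = None
    for prompt in prompts:
        content = [{"type": "text", "text": prompt}]
        if first_frame_url is not None:
            # Assumed item shape for a first-frame image.
            content.append({"type": "image_url",
                            "image_url": {"url": first_frame_url},
                            "role": "first_frame"})
        payload = {"model": model, "content": content, "return_last_frame": True}
        result = run_clip(payload)  # hypothetical: submit, poll, return URLs
        video_urls.append(result["video_url"])
        first_frame_url = result["last_frame_url"]
    return video_urls


# Offline demo with a fake runner that records what it was asked to do.
seen = []

def fake_run_clip(payload: dict) -> dict:
    seen.append(payload)
    n = len(seen)
    return {"video_url": f"https://example.invalid/clip{n}.mp4",
            "last_frame_url": f"https://example.invalid/frame{n}.png"}

print(chain_clips(["A fox leaves its den", "The fox crosses a river"], fake_run_clip))
```

Each clip depends on the previous one, so the chain runs sequentially; total wall-clock time is roughly the sum of the individual generation times.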

Error Codes & Retries

Code | Meaning | Suggested handling
400 | InvalidParameter: bad resolution/ratio/duration (e.g. fast + 1080p) | The message names the offending parameter; fix per the tables above. Not billed
401 | Invalid Token | Check the Bearer Token
403 | Content moderation rejection (real faces, policy violations) | Change the assets or prompt
429 | Rate limited / insufficient quota | Exponential backoff; check balance
5xx | Gateway / backend error | Retry 1-2 times
Task failed | Generation failed | Inspect the task’s error field; retry with a different seed if needed
Task expired | Exceeded execution_expires_after (default 48 h) | Resubmit
Client recommendations:
  • 30-60 s request timeouts are enough for create/poll calls (the wait happens on the task side)
  • Poll every 15-30 s with an overall budget of 15+ minutes (longer for 1080p / 15 s tasks)
  • Apply exponential backoff on 5xx and timeouts (2 retries)
  • Log the task id and the x-request-id response header for troubleshooting
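
The client recommendations above can be combined into a small polling helper. This is a minimal sketch, not a reference client: fetch_task is a hypothetical callable that performs the GET and returns the parsed task dict, the function names are illustrative, and mapping HTTP 5xx responses to a retryable exception is left to that callable.

```python
import time

TERMINAL_STATES = {"succeeded", "failed", "expired"}


def retry_with_backoff(call, retries=2, base_delay_s=2.0,
                       retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Run call(); on a retryable error wait 2 s, 4 s, ... and try again."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise
            sleep(base_delay_s * 2 ** attempt)


def poll_until_done(fetch_task, task_id, interval_s=20.0, budget_s=900.0,
                    sleep=time.sleep):
    """Poll until the task reaches a terminal state or the time budget runs out."""
    deadline = time.monotonic() + budget_s
    while True:
        task = retry_with_backoff(lambda: fetch_task(task_id), sleep=sleep)
        if task.get("status") in TERMINAL_STATES:
            return task
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"task {task_id} not finished within {budget_s} s")
        sleep(interval_s)


# Offline demo: a fake fetcher that walks through the documented states.
states = iter([{"status": "queued"}, {"status": "running"},
               {"status": "succeeded",
                "content": {"video_url": "https://example.invalid/out.mp4"}}])
task = poll_until_done(lambda _id: next(states), "task-123", sleep=lambda s: None)
print(task["status"], task["content"]["video_url"])
```

Note that the loop stops on succeeded, failed, and expired alike; the caller still needs to check which one it got before reading content.video_url.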

FAQ

The most common Seedance 2.0 error: your Token does not have the SeeDance2 group enabled. Tokens on the Default group or other video groups cannot route to this model. Enable the SeeDance2 group in Token Settings and use the Pay-as-you-go Priority billing model.
The gateway marks responses content-encoding: gzip but the body is not compressed. Add "Accept-Encoding": "identity" to your request headers; curl and browser fetch are unaffected.
generate_audio defaults to true (verified): the model adds voice, sound effects, and background music automatically. Pass "generate_audio": false explicitly for silent output.
On success the URL is at content.video_url in the poll response (not top-level). It is a signed link valid for ~24 hours — download and re-host it immediately. The task_id itself remains queryable for 7 days.
The state machine is queued → running → succeeded / failed / expired. The success state is succeeded, not completed — an easy mistake when migrating from other video APIs.
No. Seedance 2.0 rejects reference images/videos containing real human faces (upstream content safety). Alternatives: reuse face-containing output generated by Seedance models within the last 30 days, use the platform’s preset virtual avatars (asset:// IDs), or use licensed face assets.
Parameter rejections (HTTP 400) are not billed (verified). Billing is pre-charged on submit and settled on completion, so your balance fluctuates briefly — reconcile against call logs.
tokens ≈ duration(s) × width × height × 24 / 1024, verified to within 0.1%. Every ratio in a tier has the same pixel area (720p 16:9 and 9:16 both cost 108,900 tokens per 5 s) — landscape, portrait, and square all cost the same.
Both bill at the same rate on APIYI. Fast generates quicker but caps at 720p. Pick standard for 1080p or maximum detail; pick fast for high-volume 720p-and-below production.
The model picks a length between 4 and 15 s (our test produced a 10 s video) and bills by actual output. The final length is returned in the task’s duration field. Fix the duration explicitly if cost predictability matters.
No. frames and camera_fixed are Seedance 1.x parameters — not supported by the Seedance 2.0 series. Use whole-second duration instead.
No — they are three mutually exclusive modes: first frame (1 image), first+last (2 images with required first_frame/last_frame roles), and multi-modal reference (1-9 images, all reference_image). To approximate “first/last frame + reference”, use reference mode and designate a frame via the prompt.
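
The three mutually exclusive image modes can be expressed as a single builder that rejects invalid image counts. This is a sketch under assumptions: the role names first_frame, last_frame, and reference_image come from this page, but the function name build_content, the mode labels, the item shape (type, image_url.url, role), and setting a role on the single-image mode are illustrative guesses.

```python
from typing import List


def build_content(prompt: str, mode: str, image_urls: List[str]) -> list:
    """Build a content array for exactly one of the three image modes."""
    count = len(image_urls)
    if mode == "first_frame":
        if count != 1:
            raise ValueError("first_frame mode takes exactly 1 image")
        roles = ["first_frame"]
    elif mode == "first_last":
        if count != 2:
            raise ValueError("first_last mode takes exactly 2 images")
        roles = ["first_frame", "last_frame"]
    elif mode == "reference":
        if not 1 <= count <= 9:
            raise ValueError("reference mode takes 1-9 images")
        roles = ["reference_image"] * count
    else:
        raise ValueError(f"unknown mode: {mode}")

    content = [{"type": "text", "text": prompt}]
    for url, role in zip(image_urls, roles):
        # Assumed item shape; check the API Reference for the real schema.
        content.append({"type": "image_url", "image_url": {"url": url}, "role": role})
    return content


items = build_content("The dancer spins and bows", "first_last",
                      ["https://example.invalid/a.png", "https://example.invalid/b.png"])
print([item.get("role") for item in items])  # [None, 'first_frame', 'last_frame']
```

Choosing the mode explicitly, rather than inferring it from the number of images, avoids ambiguity between a two-image first+last request and a two-image reference request.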
The SeeDance2 group has ample concurrency with no queuing (15 simultaneous tasks all ran immediately in our test). Contact sales for larger sustained workloads.
Keep prompts under ~500 Chinese characters or ~1000 English words — longer prompts dilute detail. Supported languages: Chinese, English, Japanese, Spanish, Portuguese, Indonesian. Describe subject + action + camera movement + lighting/style.
Seedance 2.0 is one of the few first-tier 2026 video models that outputs synchronized audio by default. Combined with same-price aspect ratios and a 15-second ceiling, it is a strong primary channel for short-video and e-commerce asset production. To compare alternatives, the same Token (with extra groups enabled) can call Sora 2, VEO 3.1, and Wan2.7 directly.