Documentation Index
Fetch the complete documentation index at: https://docs.apiyi.com/llms.txt
Use this file to discover all available pages before exploring further.
Overview
VEO 3.1 is Google’s flagship AI video generation series, producing video with natively synchronized audio — fixed 8-second clips from text prompts or reference images. APIYI exposes VEO 3.1 through a reverse-engineered channel that proxies Google Flow, billed per clip with both synchronous streaming and async task modes.
Sync API
POST /v1/chat/completions reuses the OpenAI Chat Completions protocol with stream: true for live progress.
Async API
POST /v1/videos is a three-step async flow supporting text-to-video and Frame-to-Video uploads — built for batch management.
Why APIYI’s VEO 3.1?
VEO 3.1 is delivered through a reverse-engineered channel (a transparent proxy to Google Flow), optimized for production scenarios across price, integration friction, and feature completeness:
Price Killer · Far Below Official Pricing
Unlimited Concurrency · Production Scale
Same Per-Clip Pricing + Top-Up Bonuses
Global Zero-Friction Access
Call api.apiyi.com directly from Mainland China data centers, residential networks, or overseas nodes. Skip the Google Flow cross-border setup entirely.
OpenAI-Compatible · Dual-Mode Access
Sync mode uses /v1/chat/completions (same as chat models); async mode uses /v1/videos (OpenAI Video API style). Both protocols drop into your existing SDK / engineering code with zero changes.
Professional Support · Enterprise Onboarding
Key Features
Native Synchronized Audio
Generation Speed Leader
The -fast series finishes in 30–60 seconds and the standard series in 1–2 minutes — 50% faster than Sora 2, ideal for high-throughput content production.
Frame-to-Video Creative Mode
Models with the -fl suffix accept 1 reference image (start frame) or 2 (start + end frames) to animate static visuals or generate seamless transitions between two frames.
Portrait / Landscape Switching
Switch between portrait (default) and landscape output via the -landscape model suffix.
Live Streaming Progress
Sync mode (/v1/chat/completions + stream: true) returns real-time > 🏃 Progress: XX% text fragments — your frontend can render a progress bar directly.
Async Task Model
Each async submission returns a video_id for independent polling and download — ideal for batch management, resume-on-failure, and long-running background jobs.
Pay on Success
You are only billed for clips that generate successfully; failed tasks cost nothing.
Multi-Video Parallel (n parameter)
The n parameter generates up to 4 different videos per request (same prompt, multiple results) for variety selection.
Pricing
Billed per clip (each clip is a fixed 8-second video). Only successfully generated videos are billed — failed tasks are free.
HD Series (720p, Live)
| Model | Description | Resolution | Price |
|---|---|---|---|
| veo-3.1 | Default portrait | 720×1280 | $0.25 |
| veo-3.1-fl | Portrait + Frame-to-Video | 720×1280 | $0.25 |
| veo-3.1-fast | Portrait + fast | 720×1280 | $0.15 |
| veo-3.1-fast-fl | Portrait + fast + Frame-to-Video | 720×1280 | $0.15 |
| veo-3.1-landscape | Landscape | 1280×720 | $0.25 |
| veo-3.1-landscape-fl | Landscape + Frame-to-Video | 1280×720 | $0.25 |
| veo-3.1-landscape-fast | Landscape + fast | 1280×720 | $0.15 |
| veo-3.1-landscape-fast-fl | Landscape + fast + Frame-to-Video | 1280×720 | $0.15 |
4K Series (Rolling Out)
- Per-clip billing: Each 8-second video is a fixed unit price, independent of prompt length, reference images, or n (n=2 means billed for 2 clips)
- Failures are free: Tasks ending in failed, content-policy rejections, and gateway errors are not billed — retry safely
- Top-up bonuses: See Top-Up Promotions
Technical Specs
| Dimension | Spec |
|---|---|
| Base model name | veo-3.1 (HD) / 4K series TBD |
| Variant axes | Orientation (portrait/landscape) × Speed (standard/fast) × Mode (text-only / Frame-to-Video -fl) |
| Video duration | Fixed 8 seconds (not adjustable) |
| HD resolutions | Portrait 720×1280, landscape 1280×720 |
| 4K resolutions | Rolling out, specs TBD |
| Audio track | ✅ Synchronized native audio |
| Frame-to-Video (-fl) | ✅ Models with -fl suffix; 1 image (start frame) or 2 images (start + end) |
| Sync generation time | -fast series 30–60 sec, standard series 1–2 min |
| Sync progress streaming | ✅ /v1/chat/completions + stream: true |
| Async polling | ✅ /v1/videos + task ID + /content download |
| n parameter | Sync mode max 4 per request (async mode recommended at 1) |
| Video URL TTL | 24 hours |
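Because clip duration is fixed at 8 seconds, longer videos are assembled by generating consecutive clips and stitching them (the FAQ below suggests chaining -fl clips and joining them with ffmpeg). A minimal Python sketch of the stitch step using ffmpeg's concat demuxer — the file names are illustrative, and stream copy assumes all clips share the same codec and resolution, which holds for output from the same model:

```python
import subprocess

def concat_listing(clips):
    """Build the concat-demuxer list file contents for ffmpeg."""
    return "".join(f"file '{c}'\n" for c in clips)

def stitch(clips, list_path="clips.txt", output="final.mp4"):
    """Stitch 8-second clips into one video without re-encoding.

    Stream copy (-c copy) avoids quality loss but requires every clip
    to share codec and resolution.
    """
    with open(list_path, "w") as f:
        f.write(concat_listing(clips))
    subprocess.run(
        ["ffmpeg", "-f", "concat", "-safe", "0",
         "-i", list_path, "-c", "copy", output],
        check=True,
    )
```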
API Endpoints
| Endpoint | Method | Purpose | Content-Type |
|---|---|---|---|
| /v1/chat/completions | POST | Sync streaming generation (recommended for real-time UX) | application/json |
| /v1/videos | POST | Async task: submit text-to-video or Frame-to-Video | application/json or multipart/form-data |
| /v1/videos/{video_id} | GET | Async poll task status | — |
| /v1/videos/{video_id}/content | GET | Async download video URL | — |
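The three async endpoints above map to a three-step client. A minimal sketch in Python with requests; the JSON field names (id, status, url) are assumptions based on the OpenAI Video API style this gateway follows — see the Async API page for the authoritative schema:

```python
import os

import requests

BASE = "https://api.apiyi.com/v1"
HEADERS = {"Authorization": f"Bearer {os.environ.get('APIYI_API_KEY', '')}"}

def endpoint(path: str) -> str:
    return BASE + path

def submit(prompt: str, model: str = "veo-3.1") -> str:
    """Step 1: POST /v1/videos submits the task and returns its video_id."""
    r = requests.post(endpoint("/videos"), headers=HEADERS,
                      json={"model": model, "prompt": prompt}, timeout=30)
    r.raise_for_status()
    return r.json()["id"]      # field name assumed (OpenAI Video API style)

def status(video_id: str) -> str:
    """Step 2: GET /v1/videos/{video_id} reports the task status."""
    r = requests.get(endpoint(f"/videos/{video_id}"), headers=HEADERS, timeout=30)
    r.raise_for_status()
    return r.json()["status"]  # e.g. completed / failed

def download_url(video_id: str) -> str:
    """Step 3: GET /v1/videos/{video_id}/content returns the video URL."""
    r = requests.get(endpoint(f"/videos/{video_id}/content"),
                     headers=HEADERS, timeout=30)
    r.raise_for_status()
    return r.json()["url"]     # field name assumed; link expires after 24 h
```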
Key Parameters
Model Variant Naming Rules
VEO 3.1 toggles capabilities via model name suffixes — not separate parameters:
| Suffix | Effect | Default (no suffix) |
|---|---|---|
| -landscape | Landscape (1280×720) | Portrait (720×1280) |
| -fast | Fast tier (speed-first, lower price) | Standard tier |
| -fl | Frame-to-Video (requires uploaded image) | Pure text-to-video |
Common combinations:
- veo-3.1 — Standard portrait text-to-video (default)
- veo-3.1-landscape-fast — Fast landscape text-to-video (best value)
- veo-3.1-landscape-fl — Standard landscape Frame-to-Video
- veo-3.1-landscape-fast-fl — Fast landscape Frame-to-Video (cheapest image-to-video)
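The naming rule can be captured in a small helper. A sketch; the suffix order (landscape, then fast, then fl) follows the pricing table above:

```python
def veo_model(landscape: bool = False, fast: bool = False, fl: bool = False) -> str:
    """Compose a VEO 3.1 model name from the three variant axes:
    orientation x speed x mode. Suffixes attach in table order."""
    name = "veo-3.1"
    if landscape:
        name += "-landscape"
    if fast:
        name += "-fast"
    if fl:
        name += "-fl"
    return name
```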
n (Number of Videos per Sync Request)
- Range: 1 to 4, default 1
- Only sync mode (/v1/chat/completions) supports n; async mode ignores it
- Billed per video (n=2 means billed for 2 clips)
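A minimal sync-mode sketch, assuming the standard OpenAI SSE chunk layout (progress fragments arrive in delta.content, as described under Live Streaming Progress). The regex for the 🏃 Progress marker and the APIYI_API_KEY variable are illustrative:

```python
import json
import os
import re

import requests

PROGRESS_RE = re.compile(r"Progress:\s*(\d+)%")

def parse_progress(fragment: str):
    """Extract the percentage from a '🏃 Progress: XX%' fragment, if present."""
    m = PROGRESS_RE.search(fragment)
    return int(m.group(1)) if m else None

def generate_sync(prompt: str, model: str = "veo-3.1-fast", n: int = 1) -> str:
    """Stream a sync generation; return the accumulated non-progress text,
    which carries the video URL(s) on completion."""
    resp = requests.post(
        "https://api.apiyi.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['APIYI_API_KEY']}"},
        json={"model": model,
              "messages": [{"role": "user", "content": prompt}],
              "stream": True,
              "n": n},       # up to 4 clips per sync request
        stream=True,
        timeout=120,         # standard tier can take ~2 minutes
    )
    resp.raise_for_status()
    final = []
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        delta = json.loads(payload)["choices"][0].get("delta", {})
        text = delta.get("content") or ""
        pct = parse_progress(text)
        if pct is not None:
            print(f"progress: {pct}%")   # drive a progress bar here
        else:
            final.append(text)
    return "".join(final)
```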
Best Practices
Validate prompts with -fast first
Run veo-3.1-fast or veo-3.1-landscape-fast first ($0.15, 30–60 seconds), then switch to the standard tier for the final asset.
Pick orientation by use case
- Social-media short-form (TikTok, Reels) → portrait (no -landscape suffix)
- YouTube / ads / product demos → landscape (-landscape suffix)
Frame-to-Video prompts focus on "motion"
Reference images for -fl models already define the visuals (start frame or start + end frames). The prompt should focus on how the image animates: camera motion, object motion, lighting changes, character expressions. Example: "Camera slowly pushes in, leaves gently swaying, sunlight flickering through branches".
Frame-to-Video shines for "transitions"
Feed two keyframes (start + end) and let the model generate the in-between motion — ideal for seamless scene transitions.
Client timeout ≥ 2 minutes
Sync generation can take up to 2 minutes (-fast ≈ 60 sec, standard ≈ 2 min) — set the client timeout to 120 seconds minimum. Async POST submission is sub-second, but use 30 seconds as a baseline.
Download videos immediately
Video URLs expire after 24 hours — download as soon as the task status becomes completed to avoid expired links.
Error Codes & Retries
| Status | Meaning | Recommended Action |
|---|---|---|
| 400 | Invalid parameters (model name doesn’t exist, -fl missing image, n out of range) | Validate parameters; Frame-to-Video must use multipart upload |
| 401 / invalid_api_key | Invalid API Key | Check the Bearer token; verify the console group setting |
| 403 | Content-policy rejection | Adjust the prompt; ensure reference images are non-sensitive |
| 429 / quota_exceeded | Rate limit / quota exceeded / insufficient balance | Exponential backoff; contact sales for a higher quota |
| 5xx | Gateway / upstream error | Retry async tasks 1–2 times (no charge) |
| Task failed | Generation failed (mostly content policy or upstream capacity) | Adjust the prompt and retry; failed tasks are not billed |
| video_not_found | video_id doesn’t exist or has expired | Verify the ID; query within 24 hours |
- Sync request timeout: 120 seconds baseline (standard tier); -fast can drop to 60 seconds
- Async POST submission timeout: 30 seconds; GET polling interval 5–10 seconds, max wait 10 minutes
- Exponential backoff retries on 5xx and failed tasks (recommend 2 retries)
- Log the x-request-id response header for debugging
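The retry and polling guidance above can be sketched as two small helpers. wait_for_video takes any status getter (for example, the async GET /v1/videos/{video_id} call) so the loop itself stays independent of the HTTP layer:

```python
import time

def backoff_delays(retries: int = 2, base: float = 2.0):
    """Exponential backoff schedule for retrying 5xx / failed tasks: 2s, 4s, ..."""
    return [base * (2 ** i) for i in range(retries)]

def wait_for_video(poll, video_id: str, interval: float = 5, max_wait: float = 600):
    """Poll `poll(video_id)` every `interval` seconds until the task reaches
    a terminal status, or raise after `max_wait` seconds (the 10-minute cap)."""
    waited = 0.0
    while waited < max_wait:
        status = poll(video_id)
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
        waited += interval
    raise TimeoutError(f"task {video_id} still running after {max_wait}s")
```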
FAQ
Is VEO 3.1 official-relay or reverse-engineered? Is an official channel available?
VEO 3.1 is currently delivered through a reverse-engineered channel: a transparent proxy to Google Flow.
VEO 3.1 vs Sora 2 — which should I choose?
| Dimension | VEO 3.1 | Sora 2 (Official) |
|---|---|---|
| Price | $0.15–$0.25 / 8 sec (per clip) | $0.40–$8.40 / 4–12 sec (per second) |
| Duration | Fixed 8 sec | 4 / 8 / 12 sec |
| Generation time | 30 sec – 2 min | 3–10 min |
| Audio | ✅ Native sync | ✅ Native sync |
| Frame-to-Video | ✅ -fl series | ✅ input_reference single image |
| Stability | Reverse-engineered, subject to risk control | Official 99.99% |
| Resolution | 720p (4K rolling out) | 720p / 1024p / 1080p |
Why is the video duration fixed at 8 seconds? Can I extend it?
Clip length is fixed by the model. To go longer, chain -fl models using each clip’s end frame as the next clip’s start frame, then stitch with ffmpeg.
How do I choose between standard and -fast?
- Highest quality / hero assets → standard (veo-3.1 / veo-3.1-landscape), $0.25/clip
- Volume / experimentation / internal preview → fast (-fast suffix), $0.15/clip, faster
- The quality gap between fast and standard is small — the fast tier is sufficient for most production use cases
How do Frame-to-Video (-fl) models work?
The -fl series requires an input_reference image upload:
- 1 image → start-frame mode: the image becomes the video’s opening and the AI generates the subsequent frames
- 2 images → start + end mode: the first image opens, the second closes, and the AI generates the transition
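A multipart submission sketch for both modes in Python with requests. The field names (input_reference for the images, id in the response) are assumptions based on the OpenAI Video API style; check the Async API page for the authoritative schema:

```python
import os

import requests

def submit_frame_to_video(prompt: str, start_frame: str, end_frame: str = None,
                          model: str = "veo-3.1-fast-fl") -> str:
    """Submit a Frame-to-Video task as multipart/form-data.

    One image = start-frame mode; two images = start + end transition mode.
    """
    files = [("input_reference", open(start_frame, "rb"))]
    if end_frame is not None:
        files.append(("input_reference", open(end_frame, "rb")))
    r = requests.post(
        "https://api.apiyi.com/v1/videos",
        headers={"Authorization": f"Bearer {os.environ.get('APIYI_API_KEY', '')}"},
        data={"model": model, "prompt": prompt},
        files=files,
        timeout=30,
    )
    r.raise_for_status()
    return r.json()["id"]   # video_id for async polling
```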
Are failed generations billed?
No. Tasks ending in failed, content-policy rejections, gateway 5xx errors, and parameter errors are all not billed. Only videos that actually complete (with a returned URL) are billed.
How long are video URLs valid?
Video URLs are valid for 24 hours after completion; download promptly.
How do I read progress in sync streaming mode?
Sync mode (/v1/chat/completions + stream: true) returns SSE chunks; the progress text arrives in each chunk’s delta.content. Full example in Sync API.
Which image formats are supported? Reference image size limits?
The -fl models accept jpeg / png for input_reference, with a recommended size of ≤ 5 MB per image. There is no strict resolution requirement (unlike Sora 2), but the image aspect ratio should match the target video orientation: portrait video → portrait image, landscape → landscape; otherwise the AI will auto-crop.
Can I use the official OpenAI SDK?
Yes. Async mode works with client.videos.create(), but Frame-to-Video must use raw requests for multi-file upload (the OpenAI SDK only handles single-file uploads natively).
Can I run multiple tasks in parallel? What are the rate limits?
Yes. Each POST /v1/videos returns an independent video_id; submit and poll in parallel. The default quota covers most business needs; for enterprise batch use cases (>10 concurrent, >100 clips per day), contact sales for a dedicated resource pool.
Can I cancel a running task?
No, a submitted task cannot be cancelled once it is running. Validate prompts with -fast first to avoid wasting standard-tier runs.
Can I disable the audio track?
No, audio is always generated natively. To remove it in post-production: ffmpeg -i input.mp4 -an output.mp4.
When does the 4K version launch? What's the price?
The 4K series is rolling out; resolutions and pricing are still TBD.
Related Docs
- Sync API — /v1/chat/completions + stream: true live streaming, text-to-video + Frame-to-Video samples
- Async API — /v1/videos three-step async flow, Frame-to-Video upload, full Python client example
- Sora 2 Video Generation — OpenAI official-relay channel comparison
- Top-Up Promotions — Bonus tiers and applicable channels
- API Manual — General request, timeout, and retry guidance
- Google official Veo introduction: deepmind.google/technologies/veo/