VEO 3.1 Official Video Generation

Overview

VEO 3.1 Official is APIYI’s official-relay channel for Google Veo 3.1 — a transparent passthrough to Google AI Studio’s veo-3.1-generate-preview / veo-3.1-fast-generate-preview async endpoints, with model IDs, response fields, and constraints identical to the upstream. Pay-per-request billing, callable on the Default group — the lowest-friction official-quality Veo 3.1 channel available today.

🎬 Highlights: Transparent passthrough to Google AI Studio + native synchronized audio + flexible 4 / 6 / 8 second durations + three resolution tiers (720p / 1080p / 4k) + per-request billing from $0.3 + Default group + Pay-per-request or Pay-as-you-go Priority Tokens (no dedicated group needed; pure Pay-as-you-go is not supported). Suited for ad shorts, e-commerce assets, social-media content, and product demos that need official-grade quality with the simplest possible onboarding.

⚠️ No CDN URL returned — you must download the MP4 stream yourself: The channel currently does not return any distributable public / CDN URL. Once status: "completed", call GET /v1/videos/{task_id}/content to fetch the MP4 binary and store it in your own OSS / CDN before serving to end users. Browsers cannot hit /content directly (auth header required). See API Endpoints below.

Text-to-Video API

POST /v1/videos, generate video from text only — JSON request body, the simplest entry point.

Image-to-Video API

POST /v1/videos + multipart upload of input_reference to animate a static image into a clip.

Official vs Reverse

Decision matrix against the existing VEO 3.1 (Reverse Channel).

Why APIYI’s VEO 3.1 Official?

Drop-in replacement for the Google official / Vertex AI channel, optimized for production scenarios across onboarding friction, stability, and cost:

Official Passthrough · Identical Model IDs

Transparent passthrough to Google AI Studio’s Veo 3.1 async endpoints. Model IDs (veo-3.1-generate-preview / veo-3.1-fast-generate-preview) match the upstream exactly, with one-to-one alignment on request and response fields and constraints.

Zero-Friction Onboarding · No Group Switching

Calls work on the Default group with Pay-per-request or Pay-as-you-go Priority Tokens (pure Pay-as-you-go is not supported). No dedicated group switch needed; existing Pay-per-request Tokens work as-is — the lowest-friction official-quality channel for Veo 3.1.

Unlimited Concurrency · Production Scale

Aggregated account pool with transparent proxy — scale batch shoots, ad pipelines, and high-volume production linearly. No Google per-account tier ceiling.

Per-Request Pricing · 60%+ Cheaper than Google

veo-3.1-fast-generate-preview costs $0.3/req and veo-3.1-generate-preview costs $1.2/req, flat across every supported duration and resolution combination. Against Google’s official 8-second 1080p price that is a saving of 62.5% to 68.8%. Top-up bonuses reduce the effective cost further, and failed tasks are not billed.

Global Zero-Friction Access

No overseas server or proxy required — connect to api.apiyi.com directly from Mainland China data centers, residential networks, or overseas nodes. Skip the Google AI Studio / Vertex AI cross-border setup entirely.

Professional Support · Enterprise Onboarding

Our team has deep expertise in video generation: prompt engineering, resolution selection, batch production, and post-processing. Full PoC-to-production technical support for enterprise customers.

Key Features

Native Synchronized Audio

Veo 3.1 natively outputs video with synchronized audio (ambient sound, dialogue, score). No separate audio post-production needed — describe audio intent in your prompt.

Flexible 4 / 6 / 8 Second Durations

duration is a string enum: "4" / "6" / "8". Billing is per request, so duration does not affect price. The 1080p and 4k tiers require "8".

Three Resolution Tiers

720p / 1080p / 4k, uniform per-request pricing. Toggle landscape (16:9) and portrait (9:16) freely.

Precise Instruction Following

Veo 3.1 leads its tier on camera motion, object physics, and character expression fidelity. Rich camera-language keyword support (push/pull/pan/dolly, low/high angles).

Image-to-Video (input_reference)

Upload one image as the visual anchor to animate static content. See Image-to-Video.

Async Task Model

Submit returns a task_id immediately. Poll status independently and download the final video — ideal for batch management and resume-on-failure flows.

OpenAI-Compatible Protocol

base_url=https://api.apiyi.com/v1 + Bearer auth. Works via raw HTTP or the OpenAI SDK’s low-level client.post().
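A minimal sketch of that request shape, using only the Python standard library. The helper name `build_submit_request` is an illustrative assumption; the base URL, the /videos path, and the Bearer scheme come from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.apiyi.com/v1"  # base_url stated on this page


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to /v1/videos with Bearer auth."""
    return urllib.request.Request(
        url=f"{BASE_URL}/videos",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Strip stray whitespace around the key before building the header.
            "Authorization": f"Bearer {api_key.strip()}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen(req, timeout=30)` would actually submit the task; the sketch stops short of sending so the request can be inspected offline.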

Failures Are Free

In async mode, failed generations, content-policy rejections, and parameter errors are not billed. Only status=completed tasks are charged.

Pricing

APIYI uses pay-per-request billing — flat price within supported duration/resolution combos, no surcharge for longer or higher-res output. Per ai.google.dev/gemini-api/docs/pricing public rates, Google’s official Veo 3.1 charges per second; the discounts below are computed for 8-second videos.

Model	APIYI Price	Google Official 8s 1080p (savings)	Google Official 8s 4K (savings)
`veo-3.1-fast-generate-preview`	$0.3 / req	$0.96 (68.8% off)	$2.40 (87.5% off)
`veo-3.1-generate-preview`	$1.2 / req	$3.20 (62.5% off)	$4.80 (75.0% off)

Billing notes:

Charged per request by model name, independent of duration (4/6/8 sec), resolution (720p/1080p/4k), or whether input_reference is provided — picking 4K costs the same as 720p
In async mode, failed generations / content-policy rejections / capacity errors are all not billed
Top-up bonus tiers under Top-Up Promotions further reduce effective cost
4K renders 4–6× slower and produces files ~10× larger — default to 1080p for daily use
Google’s official 4K rate is $0.30/sec (fast) / $0.60/sec (standard), i.e. $2.40 / $4.80 for 8 sec (source: ai.google.dev/gemini-api/docs/pricing)
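The discount figures can be reproduced from the prices quoted in this section. The sketch below is plain arithmetic on those numbers; the dictionary and function names are illustrative, and `Decimal` is used only to keep the rounding exact.

```python
from decimal import Decimal, ROUND_HALF_UP

# Prices in USD, copied from the pricing table in this section.
APIYI_PRICE = {
    "veo-3.1-fast-generate-preview": Decimal("0.30"),
    "veo-3.1-generate-preview": Decimal("1.20"),
}
GOOGLE_8S = {
    ("veo-3.1-fast-generate-preview", "1080p"): Decimal("0.96"),
    ("veo-3.1-fast-generate-preview", "4k"): Decimal("2.40"),
    ("veo-3.1-generate-preview", "1080p"): Decimal("3.20"),
    ("veo-3.1-generate-preview", "4k"): Decimal("4.80"),
}


def discount_percent(model: str, resolution: str) -> Decimal:
    """Percent saved versus Google's 8-second price, rounded to 0.1."""
    saved = 1 - APIYI_PRICE[model] / GOOGLE_8S[(model, resolution)]
    return (saved * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


for model, resolution in GOOGLE_8S:
    print(f"{model} @ {resolution}: {discount_percent(model, resolution)}% off")
```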

Group Setup

VEO 3.1 Official works on the Default group (1x), no dedicated group switching required. The Token’s billing mode must be Pay-per-request or Pay-as-you-go Priority — pure Pay-as-you-go is not supported (switch the Token mode in the console if needed).

Onboarding friction comparison: Versus Sora 2 Official (which requires the dedicated Sora2Official group + Pay-as-you-go Priority only), VEO 3.1 Official runs on the Default group and accepts both Pay-per-request and Pay-as-you-go Priority — ideal when you want “drop in your existing Pay-per-request Token + change base_url” zero-config onboarding.

Dimension	VEO 3.1 Official	Notes
Group	`Default` (1x)	No switching required
Billing model	Pay-per-request ✅ / Pay-as-you-go Priority ✅ / pure Pay-as-you-go ❌	Pure Pay-as-you-go must be switched
Token requirement	Pay-per-request or Pay-as-you-go Priority + Default group	No specialized Token needed
Rate / Multiplier	1.0x	Direct settlement at the prices above

Technical Specs

Dimension	`veo-3.1-fast-generate-preview`	`veo-3.1-generate-preview`
Price	$0.3 / request	$1.2 / request
Supported duration (string)	`"4"` / `"6"` / `"8"`	`"4"` / `"6"` / `"8"`
Supported resolution (`metadata.resolution`)	`720p` / `1080p` / `4k`	`720p` / `1080p` / `4k`
Supported aspect (`metadata.aspectRatio`)	`16:9` / `9:16`	`16:9` / `9:16`
Audio	✅ Synchronized audio + video	✅
Image-to-video (input_reference)	✅ (1 reference image)	✅ (1 reference image)
Typical generation time	720p 60–90s · 1080p 80–120s · 4K 5–6 min	Same
Video retention	Not officially documented — download immediately	Same
Response fields	`id` / `task_id` / `status` / `progress` / `created_at`	Same

At 1080p / 4k resolution, duration must be "8" — "4" or "6" will be rejected upstream. All three durations are supported at 720p.

API Endpoints

Endpoint	Method	Purpose	Content-Type
`/v1/videos`	POST	Submit a video generation task (text-to-video / image-to-video, unified endpoint)	`application/json` or `multipart/form-data`
`/v1/videos/{task_id}`	GET	Query task status and progress	—
`/v1/videos/{task_id}/content`	GET	Download the generated MP4 (binary stream)	—

⚠️ MP4 binary download only: no CDN URL returned

This channel currently does not output any CDN / public URL in the response. The video file can only be retrieved as an MP4 binary stream via GET /v1/videos/{task_id}/content (requires the Authorization: Bearer header).

Implications:

No video_url / data.url / any directly distributable link is returned in the response
Frontends cannot put the endpoint URL directly in a <video> tag — browser requests without the auth header will 401
As soon as status: "completed", download the MP4 and store it in your own OSS / CDN, then serve your URL to end users
Video retention is not officially documented — do not depend long-term on the remote task_id to retrieve videos

Endpoint selection: Primary api.apiyi.com; backup gateways vip.apiyi.com / b.apiyi.com have identical behavior.

Key Parameters

⚡ Full parameter reference: Jump to Text-to-Video - Parameter Reference for the complete table covering model / prompt / duration / size / metadata.* types, defaults, and constraints. This section unpacks only the 3 most pitfall-prone parameters.

`duration` (video length)

Must be a string ("4" / "6" / "8"). Passing a number returns:

parse_request_failed: cannot unmarshal number into Go struct field ... duration of type string

Value	720p	1080p	4k
`"4"`	✅	❌	❌
`"6"`	✅	❌	❌
`"8"`	✅ (default)	✅ (required)	✅ (required)

Parameter precedence: metadata.durationSeconds > duration > seconds > 8
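A client-side pre-flight check can catch both pitfalls before a request is sent. The sketch below mirrors the documented precedence and the support table above. The function names are hypothetical, the values of metadata.durationSeconds and seconds are assumed to be string-compatible, and the size fallback for resolution is ignored for brevity.

```python
ALLOWED_DURATIONS = ("4", "6", "8")
HD_TIERS = ("1080p", "4k")


def effective_duration(body: dict) -> str:
    """Apply the precedence metadata.durationSeconds > duration > seconds > "8"."""
    metadata = body.get("metadata") or {}
    candidates = (
        metadata.get("durationSeconds"),
        body.get("duration"),
        body.get("seconds"),
    )
    for candidate in candidates:
        if candidate is not None:
            return str(candidate)
    return "8"


def check_duration(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the combination looks valid."""
    problems = []
    if "duration" in body and not isinstance(body["duration"], str):
        problems.append('duration must be a string such as "8", not a number')
    duration = effective_duration(body)
    resolution = (body.get("metadata") or {}).get("resolution", "720p")
    if duration not in ALLOWED_DURATIONS:
        problems.append(f"duration {duration!r} is not one of {ALLOWED_DURATIONS}")
    elif resolution in HD_TIERS and duration != "8":
        problems.append(f'{resolution} requires duration "8", got {duration!r}')
    return problems
```

Running `check_duration` before submitting turns an upstream rejection into a local error message, which is cheaper to debug than a failed round trip.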

`metadata.resolution` (resolution tier)

Value	Pixels (landscape)	Pixels (portrait)	Notes
`720p` (default)	`1280x720`	`720x1280`	All three durations
`1080p`	`1920x1080`	`1080x1920`	`duration="8"` only
`4k`	`3840x2160`	`2160x3840`	`duration="8"` only, 4–6× slower render

Parameter precedence: metadata.resolution > size > 720p
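The pixel table above reduces to a small lookup, which can be handy when sizing a player or validating a downloaded file. A sketch with an illustrative function name:

```python
# Landscape dimensions from the table above; portrait swaps width and height.
LANDSCAPE = {"720p": (1280, 720), "1080p": (1920, 1080), "4k": (3840, 2160)}


def pixel_size(resolution: str = "720p", aspect_ratio: str = "16:9") -> tuple[int, int]:
    """Return (width, height) for a resolution tier and aspect ratio."""
    if resolution not in LANDSCAPE:
        raise ValueError(f"unknown resolution tier: {resolution!r}")
    width, height = LANDSCAPE[resolution]
    if aspect_ratio == "16:9":
        return width, height
    if aspect_ratio == "9:16":
        return height, width
    raise ValueError(f"unsupported aspect ratio: {aspect_ratio!r}")
```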

⚠️ Do not pass `generateAudio`

Veo 3 / 3.1 is natively audio-aware, but the generateAudio parameter must not be passed — upstream will reject with INVALID_ARGUMENT. To control audio, write the intent into your prompt:

“Coastal lighthouse at dusk; waves, distant seabirds, low wind sounds, cinematic atmosphere”
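One way to enforce this in client code is to scrub the flag and fold the audio description into the prompt. The helper below is a hypothetical sketch: it removes generateAudio from both the top level and metadata, because this page does not say where callers tend to put it.

```python
import copy


def with_audio_intent(payload: dict, audio_cues: list[str]) -> dict:
    """Return a copy with generateAudio removed and audio cues appended to the prompt."""
    cleaned = copy.deepcopy(payload)  # leave the caller's dict untouched
    cleaned.pop("generateAudio", None)
    if isinstance(cleaned.get("metadata"), dict):
        cleaned["metadata"].pop("generateAudio", None)
    if audio_cues:
        base = cleaned.get("prompt", "").rstrip(" ;.,")
        cleaned["prompt"] = f"{base}; {', '.join(audio_cues)}"
    return cleaned
```

For example, a prompt of "Coastal lighthouse at dusk." with the cues "waves" and "distant seabirds" becomes "Coastal lighthouse at dusk; waves, distant seabirds".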

Best Practices

Pick a model by need

Iteration / batch previews → veo-3.1-fast-generate-preview ($0.3/request)
Final delivery / 4K → veo-3.1-generate-preview ($1.2/request)
Run both with the same prompt + seed; pick by eye

Validate with 4 seconds first

For every new prompt, start with duration: "4" at 720p on veo-3.1-fast-generate-preview to validate camera direction and style (60–90 sec render, $0.3). Scale up to 8 sec or 1080p once the look is locked.

Use async polling, not sync wait

The Official channel is async-only: POST to submit and get task_id → poll GET /v1/videos/{task_id} every 8–10 sec until status: "completed" → download from /content. No webhooks; polling only.
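The polling step can be sketched as follows. Only the "completed" and "failed" status values are named on this page, so the loop treats every other value as still in progress; that, and the function names, are assumptions. The status fetcher is passed in as a callable so the loop can be exercised without network access.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.apiyi.com/v1"


def fetch_status(task_id: str, api_key: str) -> dict:
    """GET /v1/videos/{task_id} and decode the JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/videos/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_until_done(
    get_status: Callable[[], dict],
    interval: float = 8.0,
    max_wait: float = 180.0,  # 3 min for 720p/1080p; use 600 for 4K
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until status is "completed"; raise on "failed" or on timeout."""
    deadline = clock() + max_wait
    while True:
        task = get_status()
        status = task.get("status")
        if status == "completed":
            return task
        if status == "failed":
            raise RuntimeError(f"task failed: {task}")
        if clock() + interval > deadline:
            raise TimeoutError(f"still {status!r} after {max_wait}s")
        sleep(interval)
```

A call such as `wait_until_done(lambda: fetch_status(task_id, api_key), max_wait=600)` would cover a 4K job under these assumptions.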

Set client timeouts by tier

720p / 1080p: 3 min hard timeout
4K: 10 min hard timeout
POST submit (multipart): 30 sec minimum

Download immediately on completion

Once status flips to completed, download to your own OSS / CDN immediately — do not depend long-term on the remote task_id. The /content endpoint occasionally returns 400 right after status flips; retry after 4 seconds (the sample clients have this baked in).
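The retry behaviour described above might look like the following sketch. It is not the code of the sample clients mentioned on this page; the function names are illustrative, and the downloader is injected as a callable so the retry logic can be tested offline.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

BASE_URL = "https://api.apiyi.com/v1"


def fetch_content(task_id: str, api_key: str) -> bytes:
    """GET /v1/videos/{task_id}/content and return the MP4 bytes."""
    req = urllib.request.Request(
        f"{BASE_URL}/videos/{task_id}/content",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=180) as resp:
        return resp.read()


def download_with_retry(
    fetch: Callable[[], bytes],
    attempts: int = 5,
    gap: float = 4.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Retry only on HTTP 400, waiting `gap` seconds between attempts."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            # Any other status, or the last attempt, is surfaced to the caller.
            if err.code != 400 or attempt == attempts:
                raise
            sleep(gap)
    raise RuntimeError("unreachable")
```

The returned bytes can then be written to disk or pushed to your own object storage, for example with `pathlib.Path("clip.mp4").write_bytes(data)`.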

Encode audio intent in the prompt

Do not pass generateAudio (it returns INVALID_ARGUMENT). For ambient sound, dialogue, BGM, describe in the prompt: “waves, distant seabirds, low wind sounds”.

Rate-limit on your end

Concurrency caps are not publicly documented; in practice 10 simultaneous submissions all queued successfully. Recommend production-side limit of in-flight ≤ 10, with exponential backoff for 429 / 5xx.
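A simple way to hold in-flight work at or below 10 is a bounded thread pool in which each worker runs one complete submit, poll, and download cycle. The sketch below assumes a caller-supplied `run_one` function for that cycle; both helper names and the backoff base and cap values are illustrative, not documented limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def backoff_delays(retries: int = 4, base: float = 2.0, cap: float = 60.0) -> list[float]:
    """Delays before each retry: base, 2*base, 4*base, ... capped at `cap`."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]


def run_batch(
    prompts: Iterable[str],
    run_one: Callable[[str], str],
    max_in_flight: int = 10,
) -> list[str]:
    """Run one full task cycle per prompt, at most `max_in_flight` at a time.

    Results are returned in the same order as the input prompts.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(run_one, prompts))
```

Because a worker is occupied until its video is downloaded, the pool size bounds the number of tasks in flight rather than just the submission rate.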

Error Codes & Retries

Status / Symptom	Meaning	Recommended Action
`400` + `parse_request_failed`	`duration` was a number	Use string `"4"` / `"6"` / `"8"`
`INVALID_ARGUMENT`	Passed `generateAudio` or non-8-sec at 1080p/4k	Drop `generateAudio`; set `duration="8"` for HD/4K
`401`	Invalid token	Check `Authorization: Bearer <key>` (no extra whitespace), key still valid
`429`	Rate-limited / insufficient balance	Exponential backoff retry; top up and retry
`5xx` / `INTERNAL`	Upstream transient error	Retry 1–2 times with same seed (not billed)
`GET /content` occasional 400	`status` just flipped to `completed`	Wait 4 sec and retry (clients should retry 3–5 times)
Task `failed`	Generation failed (typically content review or upstream capacity)	Adjust prompt and retry; task is not billed

Recommended client settings:

POST submit timeout: 30 sec (multipart uploads may need more)
Polling interval: 8–10 sec; max wait 720p/1080p 3 min, 4K 10 min
Exponential backoff retry for 5xx and failed (1–2 attempts recommended)
Retry /content 3–5 times with 4-second gaps
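The table and settings above can be condensed into a small decision helper. This is a sketch under assumptions: it matches on the HTTP status code and on substrings of the response body, and the `Advice` type and `advise` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advice:
    retry: bool  # whether retrying the same request can help
    hint: str    # what to change or how long to wait


def advise(status_code: int, body: str = "", endpoint: str = "submit") -> Advice:
    """Map an error response to the action recommended in the table above."""
    if status_code == 400 and endpoint == "content":
        return Advice(True, "status just flipped to completed; wait 4 s, retry 3-5 times")
    if status_code == 400 and "parse_request_failed" in body:
        return Advice(False, 'send duration as a string: "4", "6" or "8"')
    if "INVALID_ARGUMENT" in body:
        return Advice(False, 'drop generateAudio; use duration "8" at 1080p/4k')
    if status_code == 401:
        return Advice(False, "check the Authorization: Bearer header and that the key is valid")
    if status_code == 429:
        return Advice(True, "back off exponentially; top up if the balance is low")
    if 500 <= status_code < 600:
        return Advice(True, "transient upstream error; retry 1-2 times")
    return Advice(False, "unrecognized error; inspect the response body")
```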

FAQ

Official vs Reverse channel — what's the difference? Is the Reverse channel still usable?

Official (this page): Transparent passthrough to Google AI Studio’s upstream endpoints. Model IDs match Google upstream (veo-3.1-generate-preview / veo-3.1-fast-generate-preview), priced at $0.3 / $1.2 per request, async endpoint only.

Reverse (existing VEO 3.1): Reverse-engineered access to Google Flow. Model IDs are the veo-3.1-fast / veo-3.1 / -fl series, priced from $0.15 per request. It is cheaper and supports both streaming sync and async modes, plus frame-to-video (first/last frame).

See the full Official vs Reverse decision matrix. Both channels coexist; pick by business need.

Why must duration be a string?

The backend Go struct declares duration as string; passing a number is rejected at the decoder layer with parse_request_failed: cannot unmarshal number into Go struct field ... duration of type string. This is an upstream hard constraint and will not change. Remember to quote: "4" / "6" / "8".

How do I add dialogue / ambient sound / BGM? Can I pass generateAudio?

Veo 3 / 3.1 is a natively audio-enabled video model, but the generateAudio parameter must not be passed (upstream returns INVALID_ARGUMENT). To control sound, write the intent into your prompt:

“Coastal lighthouse at dusk; waves, distant seabirds, low wind sounds, cinematic atmosphere”

fast vs standard — which to pick? Is fast actually faster?

At equivalent parameters, render time is roughly the same (measured 720p 8 sec: fast 83s, standard 78s). fast is not faster — it is cheaper ($0.3 vs $1.2)
Default to veo-3.1-fast-generate-preview
Switch to veo-3.1-generate-preview for final delivery or when detail fidelity / physics consistency matters
A/B in production: run both with same prompt + seed, pick by eye

Is 4K worth using?

Not recommended for most cases:

Same per-request price sounds attractive
But rendering is 4–6× slower (720p 80s → 4K 350s)
Files are ~10× larger (720p 4MB → 4K 40MB), so bandwidth and storage costs rise roughly tenfold
1080p is visually sufficient for most playback scenarios

When you do need 4K: use veo-3.1-generate-preview, set duration="8" (mandatory), client timeout ≥ 10 min, run as a backgrounded async task.

When is a task done? Are there webhooks?

There are no webhooks; poll GET /v1/videos/{task_id} only
Recommended polling interval: 8 sec (measured sufficient, won’t trip rate limits)
Measured times: 720p / 1080p 60–115 sec, 4K 5–6 min
Client timeouts: 3 min for 720p/1080p, 10 min for 4K

Why does GET /content return 400?

Right after status flips to completed, calling /v1/videos/{task_id}/content occasionally returns 400 due to upstream CDN sync delay. Waiting 4 sec and retrying once typically resolves it (the sample clients retry 3–5 times with 4-second gaps).

Can I get a CDN URL for the video? Can the frontend hit the endpoint directly?

Not currently. This channel does not return any CDN / public URL in the response: no video_url, no data.url, no other directly distributable link.

The only way to retrieve the video: after status: "completed", call GET /v1/videos/{task_id}/content to get the MP4 binary stream (requires the Authorization: Bearer header).

Standard production pattern:

Backend downloads the MP4 as soon as the task completes → push to your own OSS / CDN
Serve your CDN URL to end users
The frontend <video> tag should NOT point directly at /content — browsers can’t include the auth header, requests will 401

If/when the upstream exposes CDN URLs, this page will be updated.

How long are videos kept on the server? Do I need to download immediately?

Retention period is not officially documented. Strongly recommended: download immediately on completion and store locally — do not depend long-term on the remote task_id; /content will eventually 404 after expiry.

Why does progress stay at 50%?

The progress field is coarse-grained — only jumps between 0 / 50 / 100. Do not use it for a percentage progress bar. Use a spinner, or compute “elapsed / expected” yourself.
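An "elapsed / expected" estimate could be computed as in the sketch below. The expected durations are the upper ends of the typical generation times listed under Technical Specs, and the 95% ceiling before completion is an arbitrary choice for illustration.

```python
# Upper ends of the typical generation times from Technical Specs, in seconds.
EXPECTED_SECONDS = {"720p": 90, "1080p": 120, "4k": 360}


def estimated_percent(elapsed: float, resolution: str = "720p", done: bool = False) -> int:
    """Estimate progress from wall-clock time instead of the coarse progress field."""
    if done:
        return 100
    expected = EXPECTED_SECONDS[resolution]
    # Hold at 95 until the task actually reports completion.
    return min(95, int(100 * max(0.0, elapsed) / expected))
```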

Are failed generations billed?

No. Only status=completed tasks are charged. failed / canceled / content-policy rejected / parameter error are all free. No real video output, no charge.

Can seed reproduce identical videos?

Not byte-identical. Measured: same prompt + same seed (88888) + same params, fast run twice: file sizes 9.81 MB vs 9.25 MB, md5 completely different, render times also differ.

But seed is not decorative: same-seed outputs cluster together (5-run test, intra-group file size spread only 6%), while different seeds shift systematically (inter-group spread +36.8%). Implications:

Want a “stable look” → fix the seed
Want to “explore variations” → swap seed instead of fiddling with prompt
Want “exact replay” → forget it, store the mp4

Can I pass multiple reference images? First/last frame?

Neither is currently supported. Image-to-video accepts only 1 image, with the field name fixed as input_reference, supplied as a file or Base64 and not as a remote URL.

Google upstream Veo 3.1 supports multi-reference / first-last-frame / video extension, but this channel does not. For first/last frame, use the VEO 3.1 (Reverse) -fl series.
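For the single-image case, a multipart body with one input_reference file part can be assembled with the standard library alone. This is a sketch under assumptions: `encode_multipart` is a hypothetical helper, and sending model, prompt, and duration as plain form fields is assumed here rather than confirmed on this page (see the Image-to-Video page for the authoritative field list).

```python
import mimetypes
import uuid


def encode_multipart(
    fields: dict[str, str],
    file_field: str,
    filename: str,
    file_bytes: bytes,
) -> tuple[bytes, str]:
    """Encode text fields plus one file; return (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # The single image part; the field name is fixed as input_reference.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\nContent-Type: {mime}\r\n\r\n'.encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and header value would go into a POST to /v1/videos alongside the usual Authorization: Bearer header, with a submit timeout of at least 30 seconds as recommended above.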

Concurrency limits? QPS caps?

Measured 10 simultaneous submissions, all queued successfully — no rejection. The exact cap is not publicly stated. Recommend production-side limit of in-flight ≤ 10, with exponential backoff on 429 / 5xx.

Do videos carry watermarks or provenance metadata?

No visible watermark
But they carry Google C2PA Content Credentials (issued by Google C2PA Media Services, format urn:c2pa:...) embedded in MP4 metadata. End users cannot see them; C2PA tools (e.g., Adobe Content Authenticity) can verify “generated by Veo”
Keep this in mind for redistribution scenarios; the embedded metadata does not usually affect playback

Can I use the official OpenAI SDK directly?

Partially. The interface follows OpenAI conventions (Bearer auth + /v1/...), but the OpenAI official SDK does not expose a videos.create method (/v1/videos is a custom path). Use the OpenAI SDK’s low-level client.post() or raw HTTP. Raw HTTP is simplest — see code samples in Text-to-Video Playground.

Related Docs

Text-to-Video Playground: POST /v1/videos (JSON) interactive debugger + 5-language code samples
Image-to-Video Playground: POST /v1/videos (multipart) + input_reference usage
Official vs Reverse decision matrix: Differences vs VEO 3.1 (Reverse)
Top-Up Promotions: Bonus tiers and eligible channels
API Manual: Generic call conventions, timeout and retry guidance
Google official model page: ai.google.dev/gemini-api/docs/models/veo-3.1-generate-preview
Google video generation docs: ai.google.dev/gemini-api/docs/video

VEO 3.1 Official is APIYI’s stable official-relay service — a transparent passthrough to Google AI Studio. Model IDs, response fields, and constraints match Google upstream exactly, and the channel works on the Default group with pay-per-request billing — the lowest-friction official-quality channel available. Please file feedback in the console support panel.
