VEO 3.1 Official Image-to-Video API

The interactive Playground on the right supports live debugging. Fill in your API Key under Authorization (format: Bearer sk-xxx), upload a reference image, enter a prompt, pick the model, duration, and resolution, then send. The Default group works; no dedicated group switch is needed.

Scope: This page covers “generate video from a reference image” — upload one image as the visual anchor / starting frame to animate static content. If you don’t need a reference image, use the Text-to-Video endpoint (same endpoint, JSON body).

⚠️ Image-to-video constraints

Content-Type must be multipart/form-data (not JSON)
Only 1 reference image supported; field name is fixed as input_reference. Multi-image submissions only keep the first
Remote URLs not accepted — must be a file upload or Base64
Accepted formats: image/jpeg / image/png / image/webp
duration must still be a string "4" / "6" / "8"; passing a number fails
At 1080p / 4k, duration must be "8"
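
The constraints above can be checked on the client before uploading, which saves a round trip. The following is a minimal sketch, not part of the API: the function name `check_image_to_video_request` and the extension-based type check are assumptions made for illustration, and the server remains the authority on what it accepts.

```python
import os

# Extensions mapped to the MIME types listed above (extension-based check is an assumption).
ALLOWED_EXTENSIONS = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
}
ALLOWED_DURATIONS = {"4", "6", "8"}
HIGH_RES = {"1080p", "4k"}


def check_image_to_video_request(fields: dict, image_name: str) -> list:
    """Return a list of problems; an empty list means the request passes these checks."""
    problems = []

    ext = os.path.splitext(image_name)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported image extension: {ext or '(none)'}")

    duration = fields.get("duration", "8")  # documented default is "8"
    if not isinstance(duration, str):
        problems.append("duration must be a string, not a number")
    elif duration not in ALLOWED_DURATIONS:
        problems.append('duration must be "4", "6", or "8"')

    if fields.get("resolution", "720p") in HIGH_RES and duration != "8":
        problems.append('duration must be "8" at 1080p / 4k')

    return problems
```

Calling it with `{"duration": "4", "resolution": "4k"}` and `"lighthouse.png"` reports the duration rule for high resolutions; a valid 720p request returns an empty list.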

Google's upstream Veo 3.1 supports multiple reference images, first/last-frame control, and video extension; this Official channel does not expose them yet. For first/last frame, use the VEO 3.1 (Reverse) -fl series.

Code Samples

Python (OpenAI SDK · low-level client.post)

# /v1/videos here is a custom path that returns task_id, so call it through the SDK's low-level client.post()
from openai import OpenAI
import time

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apiyi.com/v1"
)

# Step 1: multipart upload. client.post() takes form fields via `body`; the
# multipart/form-data header makes the SDK send them as form data next to
# `files` and generate the boundary itself.
with open("./lighthouse.png", "rb") as f:
    resp = client.post(
        "/videos",
        body={
            "model": "veo-3.1-fast-generate-preview",
            "prompt": "Camera slowly rises from the base of the lighthouse to the top, dusk lighting, ocean ambience",
            "duration": "8",  # must be string
            "size": "1280x720",
            "resolution": "720p",
            "aspectRatio": "16:9"
        },
        files={
            "input_reference": ("lighthouse.png", f, "image/png")
        },
        options={"headers": {"Content-Type": "multipart/form-data"}},
        cast_to=dict
    )
task_id = resp["task_id"]
print(f"Task ID: {task_id}, status: {resp['status']}")

# Step 2: poll (up to 3 minutes)
deadline = time.time() + 180
while time.time() < deadline:
    s = client.get(f"/videos/{task_id}", cast_to=dict)
    print(f"Status: {s['status']}, progress: {s.get('progress', 0)}%")
    if s["status"] == "completed":
        break
    if s["status"] == "failed":
        raise RuntimeError(f"Generation failed: {s}")
    time.sleep(8)
else:
    # loop ended without break, so the task never reached "completed"
    raise TimeoutError("Task did not complete within 3 minutes")

# Step 3: download (with retry)
import urllib.request, urllib.error
time.sleep(4)
for i in range(5):
    try:
        req = urllib.request.Request(
            f"https://api.apiyi.com/v1/videos/{task_id}/content",
            headers={"Authorization": "Bearer sk-your-api-key"}
        )
        with urllib.request.urlopen(req, timeout=180) as r, open("output.mp4", "wb") as f:
            while chunk := r.read(1 << 16):
                f.write(chunk)
        break
    except urllib.error.HTTPError:
        if i == 4:
            raise
        time.sleep(4)
print("Saved: output.mp4")

Python (requests + multipart)

import requests
import time

API_KEY = "sk-your-api-key"
BASE_URL = "https://api.apiyi.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Step 1: multipart upload of reference image + form fields
with open("./lighthouse.png", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/videos",
        headers=HEADERS,  # do not set Content-Type manually; requests handles multipart boundary
        data={
            "model": "veo-3.1-fast-generate-preview",
            "prompt": "Camera slowly rises from the base of the lighthouse to the top, dusk lighting, waves lapping the rocks",
            "duration": "8",  # string, not number
            "size": "1280x720",
            "resolution": "720p",
            "aspectRatio": "16:9",
            "seed": "20260521"
        },
        files={
            "input_reference": ("lighthouse.png", f, "image/png")
        },
        timeout=60  # multipart upload of large images may be slow
    ).json()
task_id = resp["task_id"]
print(f"Task ID: {task_id}")

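# Step 2: poll (up to 3 minutes)
deadline = time.time() + 180
while time.time() < deadline:
    s = requests.get(f"{BASE_URL}/videos/{task_id}", headers=HEADERS, timeout=30).json()
    print(f"Status: {s['status']}, progress: {s.get('progress', 0)}%")
    if s["status"] == "completed":
        break
    if s["status"] == "failed":
        raise RuntimeError(s)
    time.sleep(8)
else:
    # loop ended without break, so the task never reached "completed"
    raise TimeoutError("Task did not complete within 3 minutes")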

# Step 3: download (up to 5 attempts)
time.sleep(4)
for i in range(5):
    try:
        with requests.get(
            f"{BASE_URL}/videos/{task_id}/content",
            headers=HEADERS, stream=True, timeout=180
        ) as r:
            r.raise_for_status()
            with open("output.mp4", "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
        break
    except requests.HTTPError:
        if i == 4:
            raise
        time.sleep(4)
print("Saved: output.mp4")

cURL (multipart upload)

# Step 1: multipart upload (input_reference uses @ to reference a local file)
RESP=$(curl -sS -X POST "https://api.apiyi.com/v1/videos" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "model=veo-3.1-fast-generate-preview" \
  -F "prompt=Camera slowly rises from the base of the lighthouse to the top, dusk lighting" \
  -F "duration=8" \
  -F "size=1280x720" \
  -F "resolution=720p" \
  -F "aspectRatio=16:9" \
  -F "input_reference=@./lighthouse.png;type=image/png")
TASK_ID=$(echo "$RESP" | python3 -c 'import sys,json;print(json.load(sys.stdin)["task_id"])')
echo "task_id=$TASK_ID"

# Step 2: poll
while :; do
  S=$(curl -sS -H "Authorization: Bearer sk-your-api-key" "https://api.apiyi.com/v1/videos/$TASK_ID")
  ST=$(echo "$S" | python3 -c 'import sys,json;print(json.load(sys.stdin)["status"])')
  echo "status=$ST"
  [ "$ST" = "completed" ] && break
  [ "$ST" = "failed" ] && { echo "$S"; exit 1; }
  sleep 8
done

# Step 3: download. curl's --retry does not retry HTTP 400, so loop manually;
# -f makes curl exit non-zero on HTTP errors instead of saving the error body.
sleep 4
for i in 1 2 3 4 5; do
  curl -fsSL \
    -H "Authorization: Bearer sk-your-api-key" \
    "https://api.apiyi.com/v1/videos/$TASK_ID/content" \
    -o output.mp4 && break
  sleep 4
done
ls -lh output.mp4

Node.js (native fetch + FormData)

import fs from 'node:fs';
// fetch, FormData, and Blob are globals in Node.js 18+; no extra package is needed

const API_KEY = 'sk-your-api-key';
const BASE_URL = 'https://api.apiyi.com/v1';

// Step 1: multipart upload
const buffer = fs.readFileSync('./lighthouse.png');
const form = new FormData();
form.append('model', 'veo-3.1-fast-generate-preview');
form.append('prompt', 'Camera slowly rises from the base of the lighthouse to the top, dusk lighting');
form.append('duration', '8');  // string
form.append('size', '1280x720');
form.append('resolution', '720p');
form.append('aspectRatio', '16:9');
// the third argument sets the filename of the uploaded part
form.append('input_reference', new Blob([buffer], { type: 'image/png' }), 'lighthouse.png');

const submitResp = await fetch(`${BASE_URL}/videos`, {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${API_KEY}` },  // do not set Content-Type manually
    body: form
});
const { task_id } = await submitResp.json();
console.log(`Task ID: ${task_id}`);

// Step 2: poll
let status = 'queued';
while (status !== 'completed' && status !== 'failed') {
    await new Promise(r => setTimeout(r, 8000));
    const s = await (await fetch(`${BASE_URL}/videos/${task_id}`, {
        headers: { 'Authorization': `Bearer ${API_KEY}` }
    })).json();
    status = s.status;
    console.log(`Status: ${status}, progress: ${s.progress ?? 0}%`);
}

if (status === 'failed') throw new Error('Generation failed');

// Step 3: download (with retry)
await new Promise(r => setTimeout(r, 4000));
let videoBuffer;
for (let i = 0; i < 4; i++) {
    try {
        const resp = await fetch(`${BASE_URL}/videos/${task_id}/content`, {
            headers: { 'Authorization': `Bearer ${API_KEY}` }
        });
        if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
        videoBuffer = Buffer.from(await resp.arrayBuffer());
        break;
    } catch (e) {
        if (i === 3) throw e;
        await new Promise(r => setTimeout(r, 4000));
    }
}
fs.writeFileSync('output.mp4', videoBuffer);
console.log('Saved: output.mp4');

Browser JavaScript (file input upload)

// Demo only; route through a backend proxy in production to avoid Key leakage
const fileInput = document.querySelector('input[type=file]');
const file = fileInput.files[0];

const form = new FormData();
form.append('model', 'veo-3.1-fast-generate-preview');
form.append('prompt', 'Animate this scene with a gentle camera push-in and natural ambient sound');
form.append('duration', '4');
form.append('size', '720x1280');
form.append('resolution', '720p');
form.append('aspectRatio', '9:16');
form.append('input_reference', file);

const submitResp = await fetch('https://api.apiyi.com/v1/videos', {
    method: 'POST',
    headers: { 'Authorization': 'Bearer sk-your-api-key' },
    body: form
});
const { task_id } = await submitResp.json();
console.log('Task ID:', task_id);

// After polling completes, route the /content endpoint through your backend proxy

Parameter Reference

| Param | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `input_reference` | file | Yes | - | Reference image file. Field name is fixed, only 1 image, accepts `image/jpeg` / `image/png` / `image/webp`. Remote URLs not accepted |
| `model` | string | Yes | - | `veo-3.1-fast-generate-preview` ($0.3/req) or `veo-3.1-generate-preview` ($1.2/req) |
| `prompt` | string | Yes | - | Describe how the scene should animate: camera motion, object action, lighting, style. Do not pass `generateAudio`; audio intent goes in the prompt |
| `duration` | string | No | `"8"` | `"4"` / `"6"` / `"8"`. 1080p/4k must be `"8"` |
| `size` | string | No | `1280x720` | Output pixel dimensions; lower precedence than `resolution` |
| `resolution` | string | No | `720p` | `720p` / `1080p` / `4k` |
| `aspectRatio` | string | No | `16:9` | `16:9` (landscape) / `9:16` (portrait) |
| `seed` | string | No | - | Random seed, sent as a string-encoded integer (multipart form field) |
| `negativePrompt` | string | No | - | Negative prompt; recommended `"blurry, watermark, distorted, low quality"` |

Field naming difference vs JSON mode:

JSON mode nests under metadata.* (e.g. metadata.resolution)
Multipart mode flattens them as form fields (resolution / aspectRatio / seed directly)
The code samples above already use multipart conventions
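
If a JSON-mode request body already exists, the naming difference can be bridged with a small helper. This is a sketch under stated assumptions: `json_body_to_form_fields` is a hypothetical name, and it assumes the JSON body keeps `model`, `prompt`, and `duration` at the top level with the remaining options under `metadata`, as the note above describes.

```python
def json_body_to_form_fields(body: dict) -> dict:
    """Flatten a JSON-mode body into multipart form fields, with every value as a string."""
    # Top-level keys carry over unchanged; the metadata wrapper itself is dropped.
    fields = {key: value for key, value in body.items() if key != "metadata"}
    # metadata.resolution becomes resolution, metadata.seed becomes seed, and so on.
    fields.update(body.get("metadata", {}))
    # Form fields are text, so numbers such as seed are string-encoded.
    return {key: str(value) for key, value in fields.items() if value is not None}
```

The result can be passed as `data=` to `requests.post` alongside `files={"input_reference": ...}`.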

Common mistakes:

Sending input_reference as a Base64 string inside a JSON body — must use multipart file field
Naming the field image / reference / input_image — must be exactly input_reference
Sending 2 images — server only keeps the first, the second is silently dropped
Sending a remote URL (https://cdn.../img.png) — not accepted; must be file or Base64

Response Format

The response structure is identical to Text-to-Video: Step 1 returns task_id + status: "queued", Step 2 polling returns status + coarse progress, Step 3 downloads MP4 binary from /content.

{
  "id": "task_xxxxxxxxxxxxxxxx",
  "task_id": "task_xxxxxxxxxxxxxxxx",
  "object": "video",
  "model": "veo-3.1-fast-generate-preview",
  "status": "queued",
  "progress": 0,
  "created_at": 1775025000
}

⚠️ Response field gotchas

task_id matches id; downstream should standardize on task_id
No video_url field; download from GET /v1/videos/{task_id}/content
progress jumps only between 0 / 50 / 100, not linear
/content returns 400 occasionally right after status flips to completed; retry after 4 sec
Image-to-video tasks typically take 10–30% longer than equivalent text-to-video tasks (extra image encoding step)
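
The gotchas above suggest two rules for client code: decide on `status` rather than `progress`, and retry `/content` when it answers with a non-200 status shortly after completion. The sketch below shows one way to package both. The helper names and the callback style are assumptions for illustration; `fetch_status` and `fetch_content` stand for whatever HTTP calls your client makes.

```python
import time
from typing import Callable, Tuple


def wait_for_task(
    fetch_status: Callable[[], dict],
    timeout: float = 180.0,
    interval: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until status is 'completed'; raise on 'failed' or when the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        # progress only jumps 0 / 50 / 100, so status is the reliable signal
        if task["status"] == "completed":
            return task
        if task["status"] == "failed":
            raise RuntimeError(f"generation failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task not finished after {timeout}s")
        sleep(interval)


def download_with_retry(
    fetch_content: Callable[[], Tuple[int, bytes]],
    attempts: int = 5,
    delay: float = 4.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """fetch_content() returns (http_status, body); retry until the status is 200."""
    status = None
    for attempt in range(attempts):
        status, body = fetch_content()
        if status == 200:
            return body
        if attempt < attempts - 1:
            sleep(delay)  # covers the occasional 400 right after completion
    raise RuntimeError(f"/content returned HTTP {status} after {attempts} attempts")
```

Passing the HTTP calls in as callbacks keeps the retry logic independent of the HTTP library, so the same helpers work with `requests`, `httpx`, or the OpenAI SDK client.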

This endpoint is an async task entry. Billing happens when the task reaches completed, charged per request by model name (independent of whether input_reference is provided; fast $0.3 / standard $1.2). POST submission, polling, and download themselves are not billed; failed tasks are also not billed.
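
As a worked example of the billing rule above, the following sketch totals the charge for a batch of tasks. The prices come from this page; the function name `estimate_cost` and the shape of the task records are assumptions, and actual charges are whatever the APIYI console reports.

```python
from decimal import Decimal

# Per-request prices in USD, as listed on this page.
PRICE_PER_REQUEST = {
    "veo-3.1-fast-generate-preview": Decimal("0.3"),
    "veo-3.1-generate-preview": Decimal("1.2"),
}


def estimate_cost(tasks: list) -> Decimal:
    """Sum prices for completed tasks only; queued, in-progress, and failed tasks cost nothing."""
    return sum(
        (PRICE_PER_REQUEST[task["model"]] for task in tasks if task["status"] == "completed"),
        Decimal("0"),
    )
```

One completed fast task plus one completed standard task comes to 1.5 USD; a failed task in the same batch adds nothing.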

Authorizations

Authorization

string

header

required

API Key from APIYI console (Default group + Pay-per-request or Pay-as-you-go Priority Token; pure Pay-as-you-go not supported)

Body

multipart/form-data

model

enum<string>

default:veo-3.1-fast-generate-preview

required

Model ID (per-request billing):

veo-3.1-fast-generate-preview — $0.3/request
veo-3.1-generate-preview — $1.2/request

Available options:

veo-3.1-fast-generate-preview,

veo-3.1-generate-preview

prompt

string

required

Video generation prompt. Focus on how the scene should animate: camera motion, object action, lighting, audio atmosphere. Do not pass generateAudio — audio intent goes in the prompt.

Example:

"Camera slowly rises from the base of the lighthouse to the top, dusk lighting, waves lapping the rocks"

input_reference

file

required

Reference image file. Field name is fixed as input_reference, only 1 image supported.

Accepted formats: image/jpeg / image/png / image/webp. Remote URLs not accepted — must be a file upload or Base64.

duration

enum<string>

default:8

Video length, string enum: "4" / "6" / "8". Must be "8" at 1080p / 4k.

Available options:

4,

6,

8

size

enum<string>

default:1280x720

Output pixel dimensions; lower precedence than resolution

Available options:

1280x720,

720x1280,

1920x1080,

1080x1920,

3840x2160,

2160x3840

resolution

enum<string>

default:720p

Resolution tier (multipart mode flattens this as a top-level form field; higher precedence than size)

Available options:

720p,

1080p,

4k

aspectRatio

enum<string>

default:16:9

Aspect ratio: 16:9 landscape (default) or 9:16 portrait

Available options:

16:9,

9:16
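
Because `size`, `resolution`, and `aspectRatio` overlap, it can help to compute which output dimensions a request implies. The sketch below assumes that `resolution`, when present, combines with `aspectRatio` to select one of the listed `size` values and overrides any `size` sent. The pairing table is inferred from the option lists on this page and the function name `effective_size` is hypothetical; the server's actual resolution logic is not documented here.

```python
# (resolution, aspectRatio) -> size, inferred from the option lists above.
SIZES = {
    ("720p", "16:9"): "1280x720",
    ("720p", "9:16"): "720x1280",
    ("1080p", "16:9"): "1920x1080",
    ("1080p", "9:16"): "1080x1920",
    ("4k", "16:9"): "3840x2160",
    ("4k", "9:16"): "2160x3840",
}


def effective_size(fields: dict) -> str:
    """Return the output size implied by the form fields, letting resolution win over size."""
    resolution = fields.get("resolution")
    if resolution is None:
        # No resolution given: fall back to size, then to the documented default.
        return fields.get("size", "1280x720")
    return SIZES[(resolution, fields.get("aspectRatio", "16:9"))]
```

Sending mutually consistent values, as the code samples do, avoids relying on the precedence rule at all.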

seed

string

Random seed (multipart form field; string-encoded number is fine). Fixed seed clusters outputs in style but does not byte-reproduce.

Example:

"20260521"

negativePrompt

string

Negative prompt; recommended "blurry, watermark, distorted, low quality"

Example:

"blurry, watermark, distorted, low quality"

Response

Task submitted; returns task_id and queued status

id

string

Task ID (matches task_id; downstream should standardize on task_id)

Example:

"task_xxxxxxxxxxxxxxxx"

task_id

string

Task ID for subsequent polling and download

Example:

"task_xxxxxxxxxxxxxxxx"

object

string

Object type, fixed to video

Example:

"video"

model

string

Model ID used for this task

Example:

"veo-3.1-fast-generate-preview"

status

enum<string>

Task status:

queued — submitted, awaiting processing
in_progress — generating
completed — done, downloadable (/v1/videos/{task_id}/content)
failed — failed (not billed), retry possible

Available options:

queued,

in_progress,

completed,

failed

Example:

"queued"

progress

integer

Generation progress (coarse-grained, jumps only between 0 / 50 / 100)

Example:

0

created_at

integer

Task creation Unix timestamp (seconds)

Example:

1775025000

completed_at

integer

Task completion Unix timestamp (seconds); only present for completed status

Example:

1775025090
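
The response fields above map naturally onto a small typed record. This is an illustrative sketch, not an official client type: `VideoTask` and `parse_task` are hypothetical names, and the parser reads only the fields documented on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoTask:
    task_id: str
    model: str
    status: str            # queued / in_progress / completed / failed
    progress: int          # coarse: 0 / 50 / 100
    created_at: int        # Unix seconds
    completed_at: Optional[int] = None  # only present once completed

    @property
    def is_terminal(self) -> bool:
        """True when polling can stop."""
        return self.status in ("completed", "failed")

    @property
    def content_path(self) -> str:
        """Download path for the finished MP4 (there is no video_url field)."""
        return f"/v1/videos/{self.task_id}/content"


def parse_task(payload: dict) -> VideoTask:
    """Build a VideoTask from a submit or polling response, keyed on task_id."""
    return VideoTask(
        task_id=payload["task_id"],
        model=payload["model"],
        status=payload["status"],
        progress=payload.get("progress", 0),
        created_at=payload["created_at"],
        completed_at=payload.get("completed_at"),
    )
```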
