POST /v1/videos
Image-to-video: submit a generation task from a reference image
curl --request POST \
  --url https://api.apiyi.com/v1/videos \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form model=veo-3.1-fast-generate-preview \
  --form 'prompt=Camera slowly rises from the base of the lighthouse to the top, dusk lighting, waves lapping the rocks' \
  --form input_reference='@example-file'
{
  "id": "task_xxxxxxxxxxxxxxxx",
  "task_id": "task_xxxxxxxxxxxxxxxx",
  "object": "video",
  "model": "veo-3.1-fast-generate-preview",
  "status": "queued",
  "progress": 0,
  "created_at": 1775025000
}

The interactive Playground on the right supports live debugging. Fill in your API Key under Authorization (format Bearer sk-xxx), upload a reference image, enter prompt, pick model / duration / resolution, then send. Default group works — no dedicated group switch needed.
Scope: This page covers “generate video from a reference image” — upload one image as the visual anchor / starting frame to animate static content. If you don’t need a reference image, use the Text-to-Video endpoint (same endpoint, JSON body).
⚠️ Image-to-video constraints
  • Content-Type must be multipart/form-data (not JSON)
  • Only 1 reference image supported; field name is fixed as input_reference. Multi-image submissions only keep the first
  • Remote URLs not accepted — must be a file upload or Base64
  • Accepted formats: image/jpeg / image/png / image/webp
  • duration must still be a string "4" / "6" / "8"; passing a number fails
  • At 1080p / 4k, duration must be "8"
Google upstream Veo 3.1 supports multi-reference / first-last-frame / video extension; this Official channel does not yet expose them. For first/last frame, use the VEO 3.1 (Reverse) -fl series.
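
The constraints above can be checked on the client before any bytes are uploaded. The following is a minimal sketch, not part of the API: `validate_i2v_request` and its messages are hypothetical names, the image type is guessed from the file extension only, and the rules encoded are just the ones listed above.

```python
import os

# Hypothetical helper: encodes only the constraints listed in this section.
ALLOWED_EXTENSIONS = {".jpg": "image/jpeg", ".jpeg": "image/jpeg",
                      ".png": "image/png", ".webp": "image/webp"}
ALLOWED_DURATIONS = {"4", "6", "8"}
HIGH_RES_TIERS = {"1080p", "4k"}


def validate_i2v_request(image_path, duration="8", resolution="720p"):
    """Return a list of problems; an empty list means no listed constraint is violated."""
    problems = []
    if image_path.startswith(("http://", "https://")):
        problems.append("input_reference must be an uploaded file, not a remote URL")
    elif os.path.splitext(image_path)[1].lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported image type; use JPEG, PNG or WebP")
    if not isinstance(duration, str):
        problems.append("duration must be a string, not a number")
    elif duration not in ALLOWED_DURATIONS:
        problems.append('duration must be "4", "6" or "8"')
    if resolution in HIGH_RES_TIERS and duration != "8":
        problems.append('duration must be "8" at 1080p / 4k')
    return problems


print(validate_i2v_request("lighthouse.png", "4", "4k"))
```

Returning a list rather than raising on the first problem lets a caller report every issue in one pass.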

Code Samples

Python (OpenAI SDK · low-level client.post)

# /v1/videos is a custom path here, so this sample calls the low-level client.post() instead of a typed SDK method
from openai import OpenAI
import time

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apiyi.com/v1"
)

# Step 1: multipart upload. Form fields go in body=; the multipart/form-data header
# makes the SDK encode body as form fields and leaves the boundary to httpx.
with open("./lighthouse.png", "rb") as f:
    resp = client.post(
        "/videos",
        body={
            "model": "veo-3.1-fast-generate-preview",
            "prompt": "Camera slowly rises from the base of the lighthouse to the top, dusk lighting, ocean ambience",
            "duration": "8",  # must be string
            "size": "1280x720",
            "resolution": "720p",
            "aspectRatio": "16:9"
        },
        files={
            "input_reference": ("lighthouse.png", f, "image/png")
        },
        options={"headers": {"Content-Type": "multipart/form-data"}},
        cast_to=dict
    )
task_id = resp["task_id"]
print(f"Task ID: {task_id}, status: {resp['status']}")

# Step 2: poll (up to 3 minutes)
deadline = time.time() + 180
while time.time() < deadline:
    s = client.get(f"/videos/{task_id}", cast_to=dict)
    print(f"Status: {s['status']}, progress: {s.get('progress', 0)}%")
    if s["status"] == "completed":
        break
    if s["status"] == "failed":
        raise RuntimeError(f"Generation failed: {s}")
    time.sleep(8)
else:
    # the loop ended without break: the deadline passed before the task completed
    raise TimeoutError(f"Task {task_id} did not complete within 3 minutes")

# Step 3: download (with retry)
import urllib.request, urllib.error
time.sleep(4)
for i in range(5):
    try:
        req = urllib.request.Request(
            f"https://api.apiyi.com/v1/videos/{task_id}/content",
            headers={"Authorization": "Bearer sk-your-api-key"}
        )
        with urllib.request.urlopen(req, timeout=180) as r, open("output.mp4", "wb") as f:
            while chunk := r.read(1 << 16):
                f.write(chunk)
        break
    except urllib.error.HTTPError:
        if i == 4:
            raise
        time.sleep(4)
print("Saved: output.mp4")

Python (requests + multipart)

import requests
import time

API_KEY = "sk-your-api-key"
BASE_URL = "https://api.apiyi.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Step 1: multipart upload of reference image + form fields
with open("./lighthouse.png", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/videos",
        headers=HEADERS,  # do not set Content-Type manually; requests handles multipart boundary
        data={
            "model": "veo-3.1-fast-generate-preview",
            "prompt": "Camera slowly rises from the base of the lighthouse to the top, dusk lighting, waves lapping the rocks",
            "duration": "8",  # string, not number
            "size": "1280x720",
            "resolution": "720p",
            "aspectRatio": "16:9",
            "seed": "20260521"
        },
        files={
            "input_reference": ("lighthouse.png", f, "image/png")
        },
        timeout=60  # multipart upload of large images may be slow
    ).json()
task_id = resp["task_id"]
print(f"Task ID: {task_id}")

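# Step 2: poll (up to 3 minutes)
deadline = time.time() + 180
while time.time() < deadline:
    s = requests.get(f"{BASE_URL}/videos/{task_id}", headers=HEADERS, timeout=30).json()
    print(f"Status: {s['status']}, progress: {s.get('progress', 0)}%")
    if s["status"] == "completed":
        break
    if s["status"] == "failed":
        raise RuntimeError(s)
    time.sleep(8)
else:
    # the loop ended without break: the deadline passed before the task completed
    raise TimeoutError(f"Task {task_id} did not complete within 3 minutes")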

# Step 3: download (up to 5 attempts)
time.sleep(4)
for i in range(5):
    try:
        with requests.get(
            f"{BASE_URL}/videos/{task_id}/content",
            headers=HEADERS, stream=True, timeout=180
        ) as r:
            r.raise_for_status()
            with open("output.mp4", "wb") as f:
                for chunk in r.iter_content(chunk_size=8192):
                    f.write(chunk)
        break
    except requests.HTTPError:
        if i == 4:
            raise
        time.sleep(4)
print("Saved: output.mp4")

cURL (multipart upload)

# Step 1: multipart upload (input_reference uses @ to reference a local file)
RESP=$(curl -sS -X POST "https://api.apiyi.com/v1/videos" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "model=veo-3.1-fast-generate-preview" \
  -F "prompt=Camera slowly rises from the base of the lighthouse to the top, dusk lighting" \
  -F "duration=8" \
  -F "size=1280x720" \
  -F "resolution=720p" \
  -F "aspectRatio=16:9" \
  -F "input_reference=@./lighthouse.png;type=image/png")
TASK_ID=$(echo "$RESP" | python3 -c 'import sys,json;print(json.load(sys.stdin)["task_id"])')
echo "task_id=$TASK_ID"

# Step 2: poll
while :; do
  S=$(curl -sS -H "Authorization: Bearer sk-your-api-key" "https://api.apiyi.com/v1/videos/$TASK_ID")
  ST=$(echo "$S" | python3 -c 'import sys,json;print(json.load(sys.stdin)["status"])')
  echo "status=$ST"
  [ "$ST" = "completed" ] && break
  [ "$ST" = "failed" ] && { echo "$S"; exit 1; }
  sleep 8
done

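# Step 3: download. Plain --retry only retries transient errors (not HTTP 400), so --fail
# and --retry-all-errors (curl 7.71.0+) are needed to cover the occasional 400 right after status=completed
sleep 4
curl -sSL --fail --retry 3 --retry-delay 4 --retry-all-errors \
  -H "Authorization: Bearer sk-your-api-key" \
  "https://api.apiyi.com/v1/videos/$TASK_ID/content" \
  -o output.mp4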
ls -lh output.mp4

Node.js (native fetch + FormData)

import fs from 'node:fs';

// Node 18+ provides fetch, FormData and Blob as globals. Use the global FormData with the
// global fetch; a FormData class imported from a separate package may not be sent as multipart.
const API_KEY = 'sk-your-api-key';
const BASE_URL = 'https://api.apiyi.com/v1';

// Step 1: multipart upload
const buffer = fs.readFileSync('./lighthouse.png');
const form = new FormData();
form.append('model', 'veo-3.1-fast-generate-preview');
form.append('prompt', 'Camera slowly rises from the base of the lighthouse to the top, dusk lighting');
form.append('duration', '8');  // string
form.append('size', '1280x720');
form.append('resolution', '720p');
form.append('aspectRatio', '16:9');
form.append('input_reference', new Blob([buffer], { type: 'image/png' }), 'lighthouse.png');

const submitResp = await fetch(`${BASE_URL}/videos`, {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${API_KEY}` },  // do not set Content-Type manually
    body: form
});
const { task_id } = await submitResp.json();
console.log(`Task ID: ${task_id}`);

// Step 2: poll
let status = 'queued';
while (status !== 'completed' && status !== 'failed') {
    await new Promise(r => setTimeout(r, 8000));
    const s = await (await fetch(`${BASE_URL}/videos/${task_id}`, {
        headers: { 'Authorization': `Bearer ${API_KEY}` }
    })).json();
    status = s.status;
    console.log(`Status: ${status}, progress: ${s.progress ?? 0}%`);
}

if (status === 'failed') throw new Error('Generation failed');

// Step 3: download (with retry)
await new Promise(r => setTimeout(r, 4000));
let videoBuffer;
for (let i = 0; i < 4; i++) {
    try {
        const resp = await fetch(`${BASE_URL}/videos/${task_id}/content`, {
            headers: { 'Authorization': `Bearer ${API_KEY}` }
        });
        if (!resp.ok) throw new Error(`HTTP ${resp.status}`);
        videoBuffer = Buffer.from(await resp.arrayBuffer());
        break;
    } catch (e) {
        if (i === 3) throw e;
        await new Promise(r => setTimeout(r, 4000));
    }
}
fs.writeFileSync('output.mp4', videoBuffer);
console.log('Saved: output.mp4');

Browser JavaScript (file input upload)

// Demo only; route through a backend proxy in production to avoid Key leakage
const fileInput = document.querySelector('input[type=file]');
const file = fileInput.files[0];

const form = new FormData();
form.append('model', 'veo-3.1-fast-generate-preview');
form.append('prompt', 'Animate this scene with a gentle camera push-in and natural ambient sound');
form.append('duration', '4');
form.append('size', '720x1280');
form.append('resolution', '720p');
form.append('aspectRatio', '9:16');
form.append('input_reference', file);

const submitResp = await fetch('https://api.apiyi.com/v1/videos', {
    method: 'POST',
    headers: { 'Authorization': 'Bearer sk-your-api-key' },
    body: form
});
const { task_id } = await submitResp.json();
console.log('Task ID:', task_id);

// After polling completes, route the /content endpoint through your backend proxy

Parameter Reference

| Param | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| input_reference | file | Yes | | Reference image file. Field name is fixed, only 1 image, accepts image/jpeg / image/png / image/webp. Remote URLs not accepted |
| model | string | Yes | | veo-3.1-fast-generate-preview ($0.3/req) or veo-3.1-generate-preview ($1.2/req) |
| prompt | string | Yes | | Describe how the scene should animate: camera motion, object action, lighting, style. Do not pass generateAudio; audio intent goes in the prompt |
| duration | string | No | "8" | "4" / "6" / "8". 1080p/4k must be "8" |
| size | string | No | 1280x720 | Output pixels |
| resolution | string | No | 720p | 720p / 1080p / 4k |
| aspectRatio | string | No | 16:9 | 16:9 (landscape) / 9:16 (portrait) |
| seed | int | No | | Random seed (multipart form field; string-encoded number is fine) |

Field naming difference vs JSON mode:
  • JSON mode nests under metadata.* (e.g. metadata.resolution)
  • Multipart mode flattens them as form fields (resolution / aspectRatio / seed directly)
  • The code samples above already use multipart conventions
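
To illustrate the naming difference, here is a minimal sketch that converts a JSON-mode body into multipart form fields. `to_multipart_fields` is a hypothetical helper, and the JSON layout is assumed from the single `metadata.resolution` example above; which other fields live under `metadata` in JSON mode is not specified on this page.

```python
def to_multipart_fields(json_body):
    """Lift metadata.* keys to the top level and string-encode every value.

    Assumption: JSON mode keeps resolution / aspectRatio / seed under "metadata".
    """
    fields = {k: v for k, v in json_body.items() if k != "metadata"}
    fields.update(json_body.get("metadata") or {})
    # multipart form fields are text, so numbers such as seed become strings
    return {key: str(value) for key, value in fields.items()}


json_mode_body = {
    "model": "veo-3.1-fast-generate-preview",
    "prompt": "Camera slowly rises from the base of the lighthouse to the top",
    "duration": "8",
    "metadata": {"resolution": "720p", "aspectRatio": "16:9", "seed": 20260521},
}
print(to_multipart_fields(json_mode_body))
```

The resulting dictionary can be passed as the form-field part of a multipart request, alongside the `input_reference` file.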
Common mistakes:
  • Sending input_reference as a Base64 string inside a JSON body — must use multipart file field
  • Naming the field image / reference / input_image — must be exactly input_reference
  • Sending 2 images — server only keeps the first, the second is silently dropped
  • Sending a remote URL (https://cdn.../img.png) — not accepted; must be file or Base64

Response Format

The response structure is identical to Text-to-Video: Step 1 returns task_id + status: "queued", Step 2 polling returns status + coarse progress, Step 3 downloads MP4 binary from /content.
{
  "id": "task_xxxxxxxxxxxxxxxx",
  "task_id": "task_xxxxxxxxxxxxxxxx",
  "object": "video",
  "model": "veo-3.1-fast-generate-preview",
  "status": "queued",
  "progress": 0,
  "created_at": 1775025000
}
⚠️ Response field gotchas
  • task_id matches id; downstream should standardize on task_id
  • No video_url field; download from GET /v1/videos/{task_id}/content
  • progress jumps only between 0 / 50 / 100, not linear
  • /content returns 400 occasionally right after status flips to completed; retry after 4 sec
  • Image-to-video tasks typically take 10–30% longer than equivalent text-to-video tasks (extra image encoding step)
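
Two of the gotchas above (standardizing on task_id, and retrying /content after an early 400) can be handled with small helpers. This is a sketch under stated assumptions: `task_id_of` and `with_retry` are hypothetical names, `fetch` stands for whatever function performs the download, and the 4-second delay and 5 attempts simply mirror the code samples.

```python
import time


def task_id_of(response):
    """Prefer task_id; fall back to id, which is documented to carry the same value."""
    return response.get("task_id") or response["id"]


def with_retry(fetch, attempts=5, delay=4.0, sleep=time.sleep):
    """Call fetch() until it succeeds, sleeping `delay` seconds after each failure.

    Catches every Exception for brevity; real code would narrow this to HTTP errors.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)


print(task_id_of({"id": "task_xxxxxxxxxxxxxxxx", "task_id": "task_xxxxxxxxxxxxxxxx"}))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.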
This endpoint is an async task entry. Billing happens when the task reaches completed, charged per request by model name (independent of whether input_reference is provided; fast $0.3 / standard $1.2). POST submission, polling, and download themselves are not billed; failed tasks are also not billed.
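
As a worked example of the billing rule, the sketch below totals the charge for a batch of tasks. `estimate_cost` is a hypothetical helper; the prices are the two per-request figures quoted above, and only tasks with status completed are counted.

```python
from decimal import Decimal

# Per-request prices in USD as quoted in this section
PRICE_PER_REQUEST = {
    "veo-3.1-fast-generate-preview": Decimal("0.3"),
    "veo-3.1-generate-preview": Decimal("1.2"),
}


def estimate_cost(tasks):
    """Sum the per-request price of every completed task; other statuses cost nothing."""
    return sum(
        (PRICE_PER_REQUEST[t["model"]] for t in tasks if t["status"] == "completed"),
        Decimal("0"),
    )


tasks = [
    {"model": "veo-3.1-fast-generate-preview", "status": "completed"},
    {"model": "veo-3.1-generate-preview", "status": "completed"},
    {"model": "veo-3.1-fast-generate-preview", "status": "failed"},  # not billed
]
print(estimate_cost(tasks))
```

`Decimal` is used instead of float so that sums of prices stay exact.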

Authorizations

Authorization
string
header
required

API Key from APIYI console (Default group + Pay-per-request or Pay-as-you-go Priority Token; pure Pay-as-you-go not supported)

Body

multipart/form-data
model
enum<string>
default:veo-3.1-fast-generate-preview
required

Model ID (per-request billing):

  • veo-3.1-fast-generate-preview — $0.3/request
  • veo-3.1-generate-preview — $1.2/request
Available options: veo-3.1-fast-generate-preview, veo-3.1-generate-preview
prompt
string
required

Video generation prompt. Focus on how the scene should animate: camera motion, object action, lighting, audio atmosphere. Do not pass generateAudio — audio intent goes in the prompt.

Example:

"Camera slowly rises from the base of the lighthouse to the top, dusk lighting, waves lapping the rocks"

input_reference
file
required

Reference image file. Field name is fixed as input_reference, only 1 image supported.

Accepted formats: image/jpeg / image/png / image/webp. Remote URLs not accepted — must be a file upload or Base64.

duration
enum<string>
default:8

Video length, string enum: "4" / "6" / "8". Must be "8" at 1080p / 4k.

Available options: 4, 6, 8
size
enum<string>
default:1280x720

Output pixel dimensions; lower precedence than resolution

Available options: 1280x720, 720x1280, 1920x1080, 1080x1920, 3840x2160, 2160x3840
resolution
enum<string>
default:720p

Resolution tier (multipart mode flattens this as a top-level form field; higher precedence than size)

Available options: 720p, 1080p, 4k
aspectRatio
enum<string>
default:16:9

Aspect ratio: 16:9 landscape (default) or 9:16 portrait

Available options: 16:9, 9:16
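
Because resolution takes precedence over size, it can help to compute which dimensions a request is likely to produce. The sketch below is an assumption-heavy illustration: `effective_size` is a hypothetical helper, and the mapping from resolution tier plus aspectRatio to pixels is inferred from the size options listed above, not from documented server behavior.

```python
# Landscape (long side, short side) per tier, inferred from the size options
TIER_PIXELS = {"720p": (1280, 720), "1080p": (1920, 1080), "4k": (3840, 2160)}


def effective_size(size="1280x720", resolution=None, aspect_ratio="16:9"):
    """Return the WxH string expected to apply when size and resolution are both sent.

    Assumption: when resolution is present it wins, and the pixel size follows
    from the tier and aspect ratio; otherwise size is used as given.
    """
    if resolution is None:
        return size
    long_side, short_side = TIER_PIXELS[resolution]
    if aspect_ratio == "9:16":
        return f"{short_side}x{long_side}"  # portrait: swap the sides
    return f"{long_side}x{short_side}"


print(effective_size(size="1280x720", resolution="1080p", aspect_ratio="9:16"))
```

Sending size and resolution values that agree with each other avoids relying on the precedence rule at all.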
seed
string

Random seed (multipart form field; string-encoded number is fine). Fixed seed clusters outputs in style but does not byte-reproduce.

Example:

"20260521"

negativePrompt
string

Negative prompt; recommended "blurry, watermark, distorted, low quality"

Example:

"blurry, watermark, distorted, low quality"

Response

Task submitted; returns task_id and queued status

id
string

Task ID (matches task_id; downstream should standardize on task_id)

Example:

"task_xxxxxxxxxxxxxxxx"

task_id
string

Task ID for subsequent polling and download

Example:

"task_xxxxxxxxxxxxxxxx"

object
string

Object type, fixed to video

Example:

"video"

model
string

Model ID used for this task

Example:

"veo-3.1-fast-generate-preview"

status
enum<string>

Task status:

  • queued — submitted, awaiting processing
  • in_progress — generating
  • completed — done, downloadable (/v1/videos/{task_id}/content)
  • failed — failed (not billed), retry possible
Available options: queued, in_progress, completed, failed
Example:

"queued"

progress
integer

Generation progress (coarse-grained, jumps only between 0 / 50 / 100)

Example:

0
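
The status values and the coarse progress field suggest a polling loop that stops only on a terminal status and does not depend on progress. A minimal sketch follows; `poll_until_terminal` is a hypothetical helper, `get_task` stands for any function returning the task JSON as a dict, and the 180-second timeout and 8-second interval are taken from the code samples rather than from a documented limit.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def poll_until_terminal(get_task, timeout=180.0, interval=8.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call get_task() until status is completed or failed, or raise TimeoutError."""
    deadline = clock() + timeout
    while True:
        task = get_task()
        if task["status"] in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task still {task['status']} after {timeout} seconds")
        sleep(interval)


# Simulated status sequence standing in for GET /v1/videos/{task_id}
states = iter([{"status": "queued", "progress": 0},
               {"status": "in_progress", "progress": 50},
               {"status": "completed", "progress": 100}])
print(poll_until_terminal(lambda: next(states), sleep=lambda seconds: None))
```

The caller still has to check whether the returned status is completed or failed before downloading.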

created_at
integer

Task creation Unix timestamp (seconds)

Example:

1775025000

completed_at
integer

Task completion Unix timestamp (seconds); only present for completed status

Example:

1775025090
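
Since both timestamps are Unix seconds, the wall-clock time of a task is a simple subtraction. The sketch below uses a hypothetical helper, `task_elapsed_seconds`, and returns None when completed_at is absent, which this page says is the case for any status other than completed.

```python
def task_elapsed_seconds(task):
    """Return completed_at - created_at in seconds, or None if the task has not completed."""
    if task.get("status") != "completed" or "completed_at" not in task:
        return None
    return task["completed_at"] - task["created_at"]


# Values taken from the examples on this page
done = {"status": "completed", "created_at": 1775025000, "completed_at": 1775025090}
print(task_elapsed_seconds(done))  # 90
```

The result covers queueing plus generation, because created_at is recorded at submission.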