POST /v1/videos
Image-to-video: submit a video generation task from a reference image
curl --request POST \
  --url https://api.apiyi.com/v1/videos \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form model=sora-2 \
  --form 'prompt=Animate this scene: gentle waves lapping, leaves swaying, cinematic camera push-in' \
  --form seconds=8 \
  --form size=1280x720 \
  --form 'input_reference=@./reference.png;type=image/png'
{
  "id": "video_abc123def456",
  "object": "video",
  "model": "sora-2",
  "status": "queued",
  "progress": 0,
  "created_at": 1712697600,
  "size": "1280x720",
  "seconds": "8",
  "quality": "standard"
}
The interactive Playground on the right supports live debugging. Set your API Key in the Authorization field (format: Bearer sk-xxx), upload a reference image, enter a prompt, choose model / size / seconds, and send.
Scope: This page covers “generate video from a reference image” — upload one image as the starting frame / visual anchor to animate static visuals. If you don’t need a reference image, use the Text-to-Video endpoint (same path, JSON body).
⚠️ Reference image dimensions must exactly match size
  • The uploaded image’s pixel dimensions must equal the size field (e.g. size=1280x720 requires a 1280×720 image)
  • Mismatch returns 400: Inpaint image must match the requested width and height
  • Pre-crop with ffmpeg / Pillow before upload
Other notes:
  • Content-Type must be multipart/form-data (not JSON)
  • Only one file is supported; the field name is fixed as input_reference
  • Accepted formats: image/jpeg / image/png / image/webp
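Because a dimension mismatch is the most common 400, it is worth validating the image before it ever leaves your machine. Below is a minimal sketch that reads the width and height straight out of a PNG's IHDR header using only the standard library (PNG only; for JPEG / WebP you would use Pillow's `Image.size` instead). The helper names are hypothetical, not part of any SDK:

```python
import struct

def png_dimensions(data: bytes) -> tuple:
    """Read (width, height) from a PNG's IHDR chunk (bytes 16-24)
    without any imaging library."""
    if data[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG file")
    return struct.unpack(">II", data[16:24])

def check_reference(data: bytes, size: str) -> None:
    """Fail fast locally instead of burning a request on the 400
    'Inpaint image must match the requested width and height'."""
    w, h = png_dimensions(data)
    expected = tuple(int(x) for x in size.split("x"))
    if (w, h) != expected:
        raise ValueError(f"image is {w}x{h}, but size={size}")
```

Call `check_reference(open("reference.png", "rb").read(), "1280x720")` right before building the multipart request.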

Code Samples

Python (OpenAI SDK Drop-In)

from openai import OpenAI
import time

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apiyi.com/v1"
)

# Step 1: Submit (the OpenAI SDK auto-handles multipart when input_reference is provided)
with open("./reference.png", "rb") as f:
    video = client.videos.create(
        model="sora-2",
        prompt="Animate this scene: gentle waves lapping against the shore, leaves swaying in the breeze",
        seconds="8",
        size="1280x720",
        input_reference=f
    )
print(f"Video ID: {video.id}, status: {video.status}")

# Step 2: Poll
while True:
    video = client.videos.retrieve(video.id)
    print(f"Status: {video.status}, progress: {getattr(video, 'progress', 0)}%")
    if video.status == "completed":
        break
    if video.status == "failed":
        raise RuntimeError(f"Generation failed: {video}")
    time.sleep(15)

# Step 3: Download
client.videos.download_content(video.id).write_to_file("output.mp4")
print("Saved: output.mp4")

Python (Raw requests + multipart)

import requests
import time

API_KEY = "sk-your-api-key"
BASE_URL = "https://api.apiyi.com/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Step 1: Multipart upload (image dimensions must equal size)
with open("./reference.png", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/videos",
        headers=HEADERS,  # Don't manually set Content-Type — requests handles the multipart boundary
        data={
            "model": "sora-2",
            "prompt": "Animate this scene with cinematic camera push-in, soft golden hour lighting",
            "seconds": "8",
            "size": "1280x720"
        },
        files={
            "input_reference": ("reference.png", f, "image/png")
        },
        timeout=60  # Multipart uploads of large images can be slow; use a 60-second timeout
    ).json()
video_id = resp["id"]
print(f"Video ID: {video_id}, status: {resp['status']}")

# Step 2: Poll
deadline = time.time() + 900
while time.time() < deadline:
    status_resp = requests.get(f"{BASE_URL}/videos/{video_id}", headers=HEADERS).json()
    print(f"Status: {status_resp['status']}, progress: {status_resp.get('progress', 0)}%")
    if status_resp["status"] == "completed":
        break
    if status_resp["status"] == "failed":
        raise RuntimeError(f"Generation failed: {status_resp}")
    time.sleep(15)

# Step 3: Download
with requests.get(f"{BASE_URL}/videos/{video_id}/content", headers=HEADERS, stream=True) as r:
    r.raise_for_status()
    with open("output.mp4", "wb") as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
print("Saved: output.mp4")

cURL

# Step 1: Multipart upload + submit
curl -X POST "https://api.apiyi.com/v1/videos" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "model=sora-2" \
  -F "prompt=Animate this scene: gentle waves lapping, leaves swaying, cinematic" \
  -F "seconds=8" \
  -F "size=1280x720" \
  -F "input_reference=@./reference.png;type=image/png"

# Step 2: Poll
curl -X GET "https://api.apiyi.com/v1/videos/video_abc123" \
  -H "Authorization: Bearer sk-your-api-key"

# Step 3: Download
curl -X GET "https://api.apiyi.com/v1/videos/video_abc123/content" \
  -H "Authorization: Bearer sk-your-api-key" \
  -o output.mp4

Node.js (fetch + FormData)

import fs from 'node:fs';
import { fileFromPath } from 'formdata-node/file-from-path';
import { FormData } from 'formdata-node';

const API_KEY = 'sk-your-api-key';
const BASE_URL = 'https://api.apiyi.com/v1';

// Step 1: Multipart upload
const form = new FormData();
form.set('model', 'sora-2');
form.set('prompt', 'Animate this scene with cinematic camera push-in, soft lighting');
form.set('seconds', '8');
form.set('size', '1280x720');
form.set('input_reference', await fileFromPath('./reference.png'));

const submitResp = await fetch(`${BASE_URL}/videos`, {
    method: 'POST',
    headers: { 'Authorization': `Bearer ${API_KEY}` },  // Don't manually set Content-Type
    body: form
});
const { id: videoId } = await submitResp.json();
console.log(`Video ID: ${videoId}`);

// Step 2: Poll
let status = 'queued';
while (status !== 'completed' && status !== 'failed') {
    await new Promise(r => setTimeout(r, 15000));
    const data = await (await fetch(`${BASE_URL}/videos/${videoId}`, {
        headers: { 'Authorization': `Bearer ${API_KEY}` }
    })).json();
    status = data.status;
    console.log(`Status: ${status}, progress: ${data.progress ?? 0}%`);
}

if (status === 'failed') throw new Error('Generation failed');

// Step 3: Download
const contentResp = await fetch(`${BASE_URL}/videos/${videoId}/content`, {
    headers: { 'Authorization': `Bearer ${API_KEY}` }
});
fs.writeFileSync('output.mp4', Buffer.from(await contentResp.arrayBuffer()));
console.log('Saved: output.mp4');

Browser JavaScript

// Demo only; route through your backend in production to avoid leaking the API key.
const fileInput = document.getElementById('refImage');  // <input type="file" />
const file = fileInput.files[0];

const form = new FormData();
form.append('model', 'sora-2');
form.append('prompt', 'Animate this scene, gentle motion');
form.append('seconds', '4');
form.append('size', '1280x720');
form.append('input_reference', file);

const submitResp = await fetch('https://api.apiyi.com/v1/videos', {
    method: 'POST',
    headers: { 'Authorization': 'Bearer sk-your-api-key' },
    body: form
});
const { id } = await submitResp.json();
console.log('Video ID:', id);

// After polling completes, route the video URL through a backend proxy to avoid downloading large files in the browser.

Parameters Quick Reference

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| model | string | Yes | sora-2 | sora-2 (720p only) or sora-2-pro (720p / 1024p / 1080p tiers) |
| prompt | string | Yes | — | Video description; focus on how the static image should animate (camera motion, object motion, lighting changes) |
| seconds | string | No | "4" | Duration as string enum: "4" / "8" / "12" |
| size | string | No | 720x1280 | Output resolution; must equal the input_reference image dimensions exactly |
| input_reference | file | Yes | — | Reference image file (image/jpeg / image/png / image/webp); dimensions must equal size |
Detailed parameter constraints, allowed values, and examples are visible in the right-hand Playground. input_reference must be uploaded via multipart — URLs and base64 are not accepted.

Reference Image Preparation

1. Pick the target resolution

Choose size first based on your use case: portrait 720x1280, landscape 1280x720, Pro 1080p landscape 1920x1080, etc.
2. Crop locally to exact pixels

Use Pillow / ffmpeg to crop or resize the image to the target dimensions:
from PIL import Image
img = Image.open("source.jpg")
img = img.resize((1280, 720), Image.LANCZOS)  # Or crop first then resize to preserve aspect ratio
img.save("reference.png")
Or one-line ffmpeg:
ffmpeg -i source.jpg -vf "scale=1280:720" reference.png
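The comment in the Pillow snippet above suggests cropping before resizing to preserve aspect ratio. A small pure-Python helper (hypothetical, not part of any SDK) that computes the center-crop box to feed `Image.crop()` before the resize:

```python
def center_crop_box(width, height, target_w, target_h):
    """Return a (left, top, right, bottom) box that center-crops a
    width x height image to the aspect ratio of target_w x target_h.
    Pass the box to Pillow's Image.crop(), then resize to the target."""
    target_ratio = target_w / target_h
    if width / height > target_ratio:
        # Source is too wide: trim the sides.
        new_w = round(height * target_ratio)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Source is too tall (or already matches): trim top and bottom.
    new_h = round(width / target_ratio)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)
```

For example, a 1600×1200 source cropped for 1280x720 yields the box (0, 150, 1600, 1050); crop to that, then `resize((1280, 720), Image.LANCZOS)` with no distortion.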
3. Pick the right format

Prefer PNG (lossless, ideal for illustrations / screenshots); use JPEG for photos to save bytes, or WebP when you want both small files and transparency.
4. Focus the prompt on "motion", not "appearance"

The reference image already defines the visuals. The prompt should describe how the scene should animate: camera push/pull, object motion, lighting changes, character expressions, etc. Example: "Camera slowly pushes in, leaves gently swaying, sunlight flickering through branches".

Response Format

The response shape is identical to Text-to-Video: submit returns id + status: "queued", polling reports progress, completion downloads via /v1/videos/{id}/content as MP4.
{
  "id": "video_abc123def456",
  "object": "video",
  "model": "sora-2",
  "status": "queued",
  "progress": 0,
  "created_at": 1712697600,
  "size": "1280x720",
  "seconds": "8",
  "quality": "standard"
}
⚠️ Common 400 errors
  • Inpaint image must match the requested width and height — the reference image dimensions don’t match size. This is the most common 400; validate dimensions client-side before upload
  • Invalid file format — uploaded file is not jpeg / png / webp, or is corrupted
  • Missing required parameter: input_reference — multipart field name is wrong (must be input_reference, not image or reference)
  • seconds must be one of "4", "8", "12" — passed integer 4 instead of string "4"
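All four of these errors are catchable before the request leaves your machine. A hypothetical pre-flight check (the allowed values are taken from the parameter table on this page; `preflight` itself is not part of any SDK):

```python
ALLOWED_SECONDS = {"4", "8", "12"}
ALLOWED_SIZES = {"720x1280", "1280x720", "1024x1792",
                 "1792x1024", "1080x1920", "1920x1080"}

def preflight(form: dict) -> list:
    """Return a list of problems the API would otherwise reject with a 400."""
    problems = []
    if "input_reference" not in form:
        problems.append("multipart field must be named input_reference")
    seconds = form.get("seconds", "4")
    if not (isinstance(seconds, str) and seconds in ALLOWED_SECONDS):
        problems.append('seconds must be one of the strings "4", "8", "12"')
    if form.get("size", "720x1280") not in ALLOWED_SIZES:
        problems.append("size must be one of the documented resolutions")
    return problems
```

Run it on the form dict just before the POST and abort if the list is non-empty; it costs nothing and saves a round trip per mistake.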
Image-to-video and text-to-video have the same per-second pricing (billed by seconds); uploading a reference image does not cost extra. See the pricing table.

Authorizations

Authorization
string
header
required

API Key from the APIYI console (requires the Sora2 official-relay ("官转") group and usage-based billing)

Body

multipart/form-data
model
enum<string>
default:sora-2
required

Model ID. sora-2 supports 720p only; sora-2-pro supports 720p / 1024p / 1080p

Available options:
sora-2,
sora-2-pro
prompt
string
required

Video generation prompt. Focus on how the image should animate: camera motion, object motion, lighting changes

Example:

"Animate this scene: gentle waves lapping, leaves swaying, cinematic camera push-in"

input_reference
file
required

Reference image file used as the video's starting frame / visual anchor.

  • Accepted formats: image/jpeg / image/png / image/webp
  • Dimensions must equal size, otherwise you get Inpaint image must match the requested width and height
  • Only one file is supported; field name is fixed as input_reference
seconds
enum<string>
default:4

Video duration as string enum: "4" / "8" / "12"

Available options:
4,
8,
12
size
enum<string>
default:720x1280

Output resolution. Must exactly match the input_reference image dimensions:

  • sora-2 (720p only): 720x1280 / 1280x720
  • sora-2-pro additionally: 1024x1792 / 1792x1024 / 1080x1920 / 1920x1080
Available options:
720x1280,
1280x720,
1024x1792,
1792x1024,
1080x1920,
1920x1080

Response

Task submitted, returns video_id with queued status

id
string

Task ID for subsequent polling and download

Example:

"video_abc123def456"

object
string

Object type; always video

Example:

"video"

model
string

Model ID used for this task

Example:

"sora-2"

status
enum<string>

Task status:

  • queued — submitted, waiting in queue
  • in_progress — generating
  • completed — done, ready to download (/v1/videos/{id}/content)
  • failed — failed (not billed), safe to retry
Available options:
queued,
in_progress,
completed,
failed
Example:

"queued"

progress
integer

Generation progress percentage (0–100), not strictly linear

Example:

0

created_at
integer

Task creation Unix timestamp (seconds)

Example:

1712697600

completed_at
integer

Task completion Unix timestamp (seconds), present only on completed status

Example:

1712697900

size
string

Actual output resolution (matches the requested size)

Example:

"1280x720"

seconds
string

Actual duration generated (matches the requested seconds)

Example:

"8"

quality
string

Quality tier (standard for sora-2, high for sora-2-pro)

Example:

"standard"