The interactive Playground on the right supports direct local image upload. Fill in your API Key (obtained from the APIYI Console) in the Authorization field (format: Bearer sk-xxx), select the image / mask files, fill in the prompt and model, and send.
Use case: This page is for “edit / fuse / inpaint based on one or more reference images”. Request format is multipart/form-data. For pure text-to-image, use the Text-to-Image endpoint.
⚠️ Key differences (when migrating from gpt-image-1.5)
- Do not pass input_fidelity: gpt-image-2 always runs at high fidelity; passing it returns a 400 error
- Edit requests consume noticeably more input tokens: reference images are converted to tokens under Vision pricing, so budget accordingly
- background: transparent is not supported; use opaque output or post-process the result
- Multi-image fusion is capped at 5 images: repeat the image[] field; sending more than 5 returns an error
📎 Multi-image fusion order matters
The image[] field accepts multiple reference images. Upload order maps to "image 1 / image 2 / image 3" references in the prompt. Reference them explicitly:
Place subject from image 1 into scene from image 2, using color style from image 3
Recommended ≤ 10MB per image, formats: png / jpg / webp.
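Since the prompt must reference images by upload order, a small helper can keep the numbering consistent with the file list. This is a hypothetical convenience function for illustration, not part of the API:

```python
def fusion_prompt(parts):
    # parts[i] describes what to take from image i+1 (upload order)
    return ", ".join(f"{p} from image {i + 1}" for i, p in enumerate(parts))

prompt = fusion_prompt(["Place subject", "into scene", "using color style"])
# "Place subject from image 1, into scene from image 2, using color style from image 3"
```

Keeping the file list and the description list side by side makes it harder for the prompt's numbering to drift out of sync with the actual upload order.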
Code Examples
Python (OpenAI SDK · single-image edit)
from openai import OpenAI
import base64
client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apiyi.com/v1"
)

resp = client.images.edit(
    model="gpt-image-2",
    image=open("photo.png", "rb"),
    prompt="Replace the background with a seaside sunset, preserve subject details",
    size="1536x1024",
    quality="high"
)

# b64_json is raw base64 (no prefix); decode manually
with open("edited.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))
Python (OpenAI SDK · multi-image fusion)
resp = client.images.edit(
    model="gpt-image-2",
    image=[
        open("person.png", "rb"),
        open("scene.png", "rb"),
        open("style.png", "rb"),
    ],
    prompt="Place subject from image 1 into scene from image 2, using color style from image 3, keep lighting consistent",
    size="1536x1024",
    quality="high"
)

with open("fused.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))
cURL (multi-image fusion)
curl -X POST "https://api.apiyi.com/v1/images/edits" \
-H "Authorization: Bearer sk-your-api-key" \
-F "model=gpt-image-2" \
-F "prompt=Place subject from image 1 into scene from image 2, using color style from image 3" \
-F "size=1536x1024" \
-F "quality=high" \
-F "image[][email protected]" \
-F "image[][email protected]" \
-F "image[][email protected]"
cURL (mask inpainting)
curl -X POST "https://api.apiyi.com/v1/images/edits" \
-H "Authorization: Bearer sk-your-api-key" \
-F "model=gpt-image-2" \
-F "prompt=Replace the sky with pink sunset clouds" \
-F "size=1024x1024" \
-F "quality=high" \
-F "image[][email protected]" \
-F "[email protected]" \
| jq -r '.data[0].b64_json' | base64 -d > photo_edited.png
Node.js (fetch · multi-image fusion)
// Node 18+ (built-in fetch / FormData / Blob)
import fs from 'node:fs';

const form = new FormData();
form.append('model', 'gpt-image-2');
form.append('prompt', 'Place subject from image 1 into scene from image 2');
form.append('size', '1536x1024');
form.append('quality', 'high');
form.append('image[]', new Blob([fs.readFileSync('./person.png')]), 'person.png');
form.append('image[]', new Blob([fs.readFileSync('./scene.png')]), 'scene.png');

const resp = await fetch('https://api.apiyi.com/v1/images/edits', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer sk-your-api-key' },
  body: form
});
const { data } = await resp.json();
fs.writeFileSync('fused.png', Buffer.from(data[0].b64_json, 'base64'));
Parameter Reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| model | text | Yes | — | Fixed: gpt-image-2 |
| prompt | text | Yes | — | Edit / fusion instruction |
| image[] | file | Yes | — | Reference images; field can repeat (max 5) |
| mask | file | No | — | Mask image (applies to the first image only; alpha channel required) |
| size | text | No | auto | Output size, same as text-to-image (preset or constraint-satisfying custom size) |
| quality | text | No | auto | low / medium / high / auto |
| output_format | text | No | png | png / jpeg / webp |
| output_compression | text | No | — | 0–100, only for jpeg / webp |
| background | text | No | auto | auto / opaque (transparent not supported) |
Mask Inpainting Requirements
- Same size and format as original, ≤ 50MB
- Must have alpha channel: transparent (alpha=0) = inpaint area, opaque = preserve
- Mask only applies to the first image
- Mask is a “soft guide” — the model may extend or contract around the masked region
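A valid mask is just an RGBA image whose alpha channel marks the inpaint area. As a stdlib-only sketch (no image library assumed; `write_mask` is a hypothetical helper), the following writes a minimal RGBA PNG whose top half is transparent (alpha=0, the area to repaint) and whose bottom half is opaque (preserved):

```python
import struct
import zlib

def _chunk(tag, data):
    # PNG chunk: length + tag + data + CRC over tag+data
    return (struct.pack(">I", len(data)) + tag + data
            + struct.pack(">I", zlib.crc32(tag + data) & 0xFFFFFFFF))

def write_mask(path, width, height):
    """Write an RGBA PNG mask: top half alpha=0 (inpaint), bottom half opaque."""
    rows = []
    for y in range(height):
        alpha = 0 if y < height // 2 else 255
        # Filter byte 0, then 4 bytes (RGBA) per pixel
        rows.append(b"\x00" + bytes((255, 255, 255, alpha)) * width)
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)  # 8-bit RGBA
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n"
                + _chunk(b"IHDR", ihdr)
                + _chunk(b"IDAT", zlib.compress(b"".join(rows)))
                + _chunk(b"IEND", b""))

write_mask("mask.png", 1024, 1024)
```

In practice, derive width and height from the source image, since the mask must match the original's size and format exactly.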
Multi-turn iteration: feed the previous output back as the next call’s image[] with a new instruction to incrementally refine. Each round is independently token-billed — watch cumulative cost.
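The iteration loop can be sketched as follows, assuming the OpenAI SDK from the examples above; `refine` is a hypothetical helper, and the real SDK may prefer a `("name.png", file)` tuple over a bare BytesIO for uploads:

```python
import base64
import io

def refine(client, image_bytes, prompts, model="gpt-image-2"):
    """Feed each round's output back as the next round's reference image."""
    for prompt in prompts:
        resp = client.images.edit(
            model=model,
            image=io.BytesIO(image_bytes),
            prompt=prompt,
        )
        # Each round is billed independently
        image_bytes = base64.b64decode(resp.data[0].b64_json)
    return image_bytes

# final = refine(client, open("photo.png", "rb").read(),
#                ["Replace the background with a seaside sunset",
#                 "Add warm rim lighting on the subject"])
```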
Response Example
The model returns one image per call:
{
  "created": 1776832476,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ],
  "usage": {
    "input_tokens": 1280,
    "output_tokens": 6240,
    "total_tokens": 7520
  }
}
b64_json is raw base64, without the data:image/...;base64, prefix — different from gpt-image-2-all. Decode it client-side to write a file, or prepend the prefix for browser rendering.
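The two consumption paths can be sketched as follows, using a stand-in base64 string in place of a real response:

```python
import base64

# Stand-in for resp.data[0].b64_json (raw base64, no data: prefix)
b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode()

# File output: decode directly
png_bytes = base64.b64decode(b64)

# Browser rendering: prepend the data URL prefix yourself
data_url = "data:image/png;base64," + b64
```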
Edit requests’ input_tokens are typically significantly higher than text-to-image at the same size, because reference images are billed per Vision pricing rules. Multi-image fusion adds proportionally more input tokens per image.