POST /v1/images/edits
Image Edit: edit or fuse reference images by instruction
curl --request POST \
  --url https://api.apiyi.com/v1/images/edits \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form model=gpt-image-2 \
  --form 'prompt=Place subject from image 1 into scene from image 2, using color style from image 3' \
  --form 'image[]=@person.png' \
  --form 'image[]=@scene.png' \
  --form 'image[]=@style.png' \
  --form 'mask=@mask.png' \
  --form size=1536x1024 \
  --form quality=auto \
  --form output_format=png \
  --form output_compression=50 \
  --form background=auto
{
  "created": 1776832476,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ],
  "usage": {
    "input_tokens": 1280,
    "output_tokens": 6240,
    "total_tokens": 7520
  }
}
The interactive Playground on the right supports direct local image upload. Fill in your API Key in Authorization (format: Bearer sk-xxx), select image / mask files, fill in prompt and model, and send.
Use case: This page is for “edit / fuse / inpaint based on one or more reference images”. Request format is multipart/form-data. For pure text-to-image, use the Text-to-Image endpoint.
⚠️ Key differences (when migrating from gpt-image-1.5)
  • Do not pass input_fidelity: gpt-image-2 forces high fidelity; passing it returns 400
  • Edit requests have noticeably higher input tokens — references convert to many tokens via Vision pricing; budget accordingly
  • background: transparent not supported — use opaque or post-process
  • Multi-image fusion: max 5 — repeat the image[] field; more than 5 errors out
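A small pre-flight check can catch these differences before a request leaves your code. This is an illustrative sketch, not part of any SDK: the function name `migrate_params` and the dict-based parameter shape are assumptions, and it encodes only the rules listed above.

```python
UNSUPPORTED_PARAMS = {"input_fidelity"}  # gpt-image-2 rejects this with a 400

def migrate_params(params: dict) -> dict:
    """Adapt a gpt-image-1.5 edit request for gpt-image-2 (illustrative sketch)."""
    out = {k: v for k, v in params.items() if k not in UNSUPPORTED_PARAMS}
    if out.get("background") == "transparent":
        out["background"] = "opaque"  # transparent backgrounds are not supported
    images = out.get("image")
    if isinstance(images, list) and len(images) > 5:
        raise ValueError("gpt-image-2 accepts at most 5 reference images")
    return out
```

Running this once before each call turns silent 400s into local, debuggable errors.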
📎 Multi-image fusion order matters
The image[] field accepts multiple reference images. Upload order maps to “image 1 / image 2 / image 3” references in the prompt. Reference them explicitly:
Place subject from image 1 into scene from image 2, using color style from image 3
Recommended ≤ 10MB per image, formats: png / jpg / webp.

Code Examples

Python (OpenAI SDK · single-image edit)

from openai import OpenAI
import base64

client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apiyi.com/v1"
)

resp = client.images.edit(
    model="gpt-image-2",
    image=open("photo.png", "rb"),
    prompt="Replace the background with a seaside sunset, preserve subject details",
    size="1536x1024",
    quality="high"
)

# b64_json is raw base64 (no prefix) — decode manually
with open("edited.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))

Python (OpenAI SDK · multi-image fusion)

resp = client.images.edit(
    model="gpt-image-2",
    image=[
        open("person.png", "rb"),
        open("scene.png", "rb"),
        open("style.png", "rb"),
    ],
    prompt="Place subject from image 1 into scene from image 2, using color style from image 3, keep lighting consistent",
    size="1536x1024",
    quality="high"
)

with open("fused.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))

cURL (multi-image fusion)

curl -X POST "https://api.apiyi.com/v1/images/edits" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "model=gpt-image-2" \
  -F "prompt=Place subject from image 1 into scene from image 2, using color style from image 3" \
  -F "size=1536x1024" \
  -F "quality=high" \
  -F "image[]=@person.png" \
  -F "image[]=@scene.png" \
  -F "image[]=@style.png"

cURL (mask inpainting)

curl -X POST "https://api.apiyi.com/v1/images/edits" \
  -H "Authorization: Bearer sk-your-api-key" \
  -F "model=gpt-image-2" \
  -F "prompt=Replace the sky with pink sunset clouds" \
  -F "size=1024x1024" \
  -F "quality=high" \
  -F "image[]=@photo.png" \
  -F "mask=@mask.png" \
  | jq -r '.data[0].b64_json' | base64 -d > photo_edited.png

Node.js (Native fetch + FormData · multi-image fusion)

import fs from 'node:fs';

const form = new FormData();
form.append('model', 'gpt-image-2');
form.append('prompt', 'Place subject from image 1 into scene from image 2');
form.append('size', '1536x1024');
form.append('quality', 'high');
form.append('image[]', new Blob([fs.readFileSync('./person.png')]), 'person.png');
form.append('image[]', new Blob([fs.readFileSync('./scene.png')]), 'scene.png');

const resp = await fetch('https://api.apiyi.com/v1/images/edits', {
    method: 'POST',
    headers: { 'Authorization': 'Bearer sk-your-api-key' },
    body: form
});

const { data } = await resp.json();
fs.writeFileSync('fused.png', Buffer.from(data[0].b64_json, 'base64'));

Parameter Reference

Field               Type     Required  Default  Description
model               text     Yes       -        Fixed: gpt-image-2
prompt              text     Yes       -        Edit / fusion instruction
image[]             file     Yes       -        Reference images, can repeat (max 5)
mask                file     No        -        Mask image (only applies to first image, alpha channel required)
size                text     No        auto     Output size, same as text-to-image
quality             text     No        auto     low / medium / high / auto
output_format       text     No        png      png / jpeg / webp
output_compression  integer  No        -        0–100, only for jpeg / webp
background          text     No        auto     auto / opaque (not supported: transparent)

Mask Inpainting Requirements

  • Same size and format as original, ≤ 50MB
  • Must have alpha channel: transparent (alpha=0) = inpaint area, opaque = preserve
  • Mask only applies to the first image
  • Mask is a “soft guide” — the model may extend or contract around the masked region
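The alpha-channel rule above can be satisfied with a few lines of Pillow (an assumption of this sketch; any tool that writes RGBA PNGs works). The solid placeholder image stands in for your real photo so the mask's size matches, as required:

```python
from PIL import Image

# Stand-in for the original photo; the mask must match its size and format.
original = Image.new("RGBA", (1024, 1024), (128, 128, 128, 255))

# Fully opaque everywhere = preserve everything by default.
mask = Image.new("RGBA", original.size, (0, 0, 0, 255))

# Make the top half transparent: alpha=0 marks the region to repaint (e.g. the sky).
transparent_top = Image.new("RGBA", (original.width, original.height // 2), (0, 0, 0, 0))
mask.paste(transparent_top, (0, 0))

mask.save("mask.png")
```

Remember the mask is a soft guide, so leave a little margin around the region you want changed.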
Multi-turn iteration: feed the previous output back as the next call’s image[] with a new instruction to incrementally refine. Each round is independently token-billed — watch cumulative cost.
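The multi-turn loop can be sketched as follows, assuming a client with the OpenAI-SDK shape used in the examples above (the function name `iterative_edit` is illustrative). Tracking the cumulative token count makes the per-round billing visible:

```python
import base64
import io

def iterative_edit(client, start_image_bytes, prompts, model="gpt-image-2"):
    """Apply a sequence of edit instructions, feeding each round's output
    back in as the next round's image[]. Returns (final_bytes, total_tokens)."""
    current = start_image_bytes
    total_tokens = 0
    for prompt in prompts:
        resp = client.images.edit(
            model=model,
            image=io.BytesIO(current),  # previous output becomes the new reference
            prompt=prompt,
        )
        current = base64.b64decode(resp.data[0].b64_json)
        total_tokens += resp.usage.total_tokens  # each round billed independently
    return current, total_tokens
```

Cap the number of rounds or check `total_tokens` against a budget inside the loop, since input tokens grow with every image fed back in.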

Response Format

{
    "created": 1776832476,
    "data": [
        {
            "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
        }
    ],
    "usage": {
        "input_tokens": 1280,
        "output_tokens": 6240,
        "total_tokens": 7520
    }
}
b64_json is raw base64, without the data:image/...;base64, prefix — different from gpt-image-2-all. Decode it client-side to write a file, or prepend the prefix for browser rendering.
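Both handling paths fit in one small helper (the name `to_file_and_data_url` is illustrative, and `image/png` assumes the default output_format):

```python
import base64

def to_file_and_data_url(b64_json: str, path: str) -> str:
    """Decode raw base64 (no prefix) to a file on disk, and return a
    data URL suitable for an <img src=...> in the browser."""
    with open(path, "wb") as f:
        f.write(base64.b64decode(b64_json))
    return "data:image/png;base64," + b64_json
```

Use the returned string directly as an `img` element's `src`; use the file for anything server-side.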
Edit requests’ input_tokens are typically significantly higher than text-to-image at the same size, because reference images are billed per Vision pricing rules. Multi-image fusion adds proportionally more input tokens per image.

Authorizations

Authorization
string
header
required

API Key obtained from APIYI Console

Body

multipart/form-data
model
enum<string>
default:gpt-image-2
required

Model name, fixed as gpt-image-2

Available options:
gpt-image-2
prompt
string
required

Edit/fusion instruction. For multi-image, use 'image 1 / image 2 / image 3' to reference upload order

Example:

"Place subject from image 1 into scene from image 2, using color style from image 3"

image[]
file[]
required

Reference images (max 5). Each ≤ 10MB, formats: png/jpg/webp

Maximum array length: 5
mask
file

Mask image (optional, only applies to first image). Requirements:

  • Same size and format as original
  • Must have alpha channel (alpha=0 = inpaint area, opaque = preserve)
  • Single file ≤ 50MB
size
string
default:auto

Output size (same as text-to-image). Preset or constraint-satisfying custom size

Example:

"1536x1024"

quality
enum<string>
default:auto

Quality tier

Available options:
auto,
low,
medium,
high
output_format
enum<string>
default:png

Output format

Available options:
png,
jpeg,
webp
output_compression
integer

Output compression (0–100), only effective for jpeg/webp

Required range: 0 <= x <= 100
background
enum<string>
default:auto

Background mode. auto or opaque. Not supported: transparent

Available options:
auto,
opaque

Response

Image generated successfully

created
integer
Example:

1776832476

data
object[]

Generation results (this model returns 1 image per call)

usage
object

Token usage for this call