The interactive Playground on the right supports direct local image upload. Fill in your API Key (obtained from the APIYI Console) in the Authorization field (format: Bearer sk-xxx), select the image / mask files, fill in the prompt and model, and send.
Use case: This page is for “edit / fuse / inpaint based on one or more reference images”. Request format is multipart/form-data. For pure text-to-image, use the Text-to-Image endpoint.
⚠️ Key differences (when migrating from gpt-image-1.5)
- Do not pass input_fidelity: gpt-image-2 always runs at high fidelity; passing it returns a 400 error
- Edit requests consume noticeably more input tokens: reference images are converted to tokens under Vision pricing, so budget accordingly
- background: transparent is not supported; use opaque output or post-process the result
- Multi-image fusion is capped at 5 images: repeat the image[] field; sending more than 5 returns an error
📎 Multi-image fusion order matters
The image[] field accepts multiple reference images. Upload order maps to "image 1 / image 2 / image 3" references in the prompt. Reference them explicitly:
Place subject from image 1 into scene from image 2, using color style from image 3
Recommended ≤ 10MB per image, formats: png / jpg / webp.
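Since the prompt must reference images by upload order, a small helper can keep the numbering consistent with the file list. This is a hypothetical convenience function for illustration, not part of the API:

```python
def fusion_prompt(parts):
    # parts[i] describes what to take from image i+1 (upload order)
    return ", ".join(f"{p} from image {i + 1}" for i, p in enumerate(parts))

prompt = fusion_prompt(["Place subject", "into scene", "using color style"])
# "Place subject from image 1, into scene from image 2, using color style from image 3"
```

Keeping the file list and the description list side by side makes it harder for the prompt's numbering to drift out of sync with the actual upload order.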
Code Examples
Python (OpenAI SDK · single-image edit)
from openai import OpenAI
import base64
client = OpenAI(
    api_key="sk-your-api-key",
    base_url="https://api.apiyi.com/v1"
)

resp = client.images.edit(
    model="gpt-image-2",
    image=open("photo.png", "rb"),
    prompt="Replace the background with a seaside sunset, preserve subject details",
    size="1536x1024",
    quality="high"
)

# b64_json is raw base64 (no prefix); decode manually
with open("edited.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))
Python (OpenAI SDK · multi-image fusion)
resp = client.images.edit(
    model="gpt-image-2",
    image=[
        open("person.png", "rb"),
        open("scene.png", "rb"),
        open("style.png", "rb"),
    ],
    prompt="Place subject from image 1 into scene from image 2, using color style from image 3, keep lighting consistent",
    size="1536x1024",
    quality="high"
)

with open("fused.png", "wb") as f:
    f.write(base64.b64decode(resp.data[0].b64_json))
cURL (multi-image fusion)
curl -X POST "https://api.apiyi.com/v1/images/edits" \
-H "Authorization: Bearer sk-your-api-key" \
-F "model=gpt-image-2" \
-F "prompt=Place subject from image 1 into scene from image 2, using color style from image 3" \
-F "size=1536x1024" \
-F "quality=high" \
-F "image[][email protected]" \
-F "image[][email protected]" \
-F "image[][email protected]"
cURL (mask inpainting)
curl -X POST "https://api.apiyi.com/v1/images/edits" \
-H "Authorization: Bearer sk-your-api-key" \
-F "model=gpt-image-2" \
-F "prompt=Replace the sky with pink sunset clouds" \
-F "size=1024x1024" \
-F "quality=high" \
-F "image[][email protected]" \
-F "[email protected]" \
| jq -r '.data[0].b64_json' | base64 -d > photo_edited.png
Node.js (fetch · multi-image fusion)
// Node 18+ (built-in fetch / FormData / Blob)
import fs from 'node:fs';

const form = new FormData();
form.append('model', 'gpt-image-2');
form.append('prompt', 'Place subject from image 1 into scene from image 2');
form.append('size', '1536x1024');
form.append('quality', 'high');
form.append('image[]', new Blob([fs.readFileSync('./person.png')]), 'person.png');
form.append('image[]', new Blob([fs.readFileSync('./scene.png')]), 'scene.png');

const resp = await fetch('https://api.apiyi.com/v1/images/edits', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer sk-your-api-key' },
  body: form
});
const { data } = await resp.json();
fs.writeFileSync('fused.png', Buffer.from(data[0].b64_json, 'base64'));
Parameter Reference
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| model | text | Yes | — | Fixed: gpt-image-2 |
| prompt | text | Yes | — | Edit / fusion instruction |
| image[] | file | Yes | — | Reference images; field can repeat (max 5) |
| mask | file | No | — | Mask image (applies to the first image only; alpha channel required) |
| size | text | No | auto | Output size, same as text-to-image (preset or constraint-satisfying custom size) |
| quality | text | No | auto | low / medium / high / auto |
| output_format | text | No | png | png / jpeg / webp |
| output_compression | text | No | — | 0–100, only for jpeg / webp |
| background | text | No | auto | auto / opaque (transparent not supported) |
Mask Inpainting Requirements
- Same size and format as original, ≤ 50MB
- Must have alpha channel: transparent (alpha=0) = inpaint area, opaque = preserve
- Mask only applies to the first image
- Mask is a “soft guide” — the model may extend or contract around the masked region
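A valid mask is just an RGBA image whose alpha channel marks the inpaint area. As a stdlib-only sketch (no image library assumed; `write_mask` is a hypothetical helper), the following writes a minimal RGBA PNG whose top half is transparent (alpha=0, the area to repaint) and whose bottom half is opaque (preserved):

```python
import struct
import zlib

def _chunk(tag, data):
    # PNG chunk: length + tag + data + CRC over tag+data
    return (struct.pack(">I", len(data)) + tag + data
            + struct.pack(">I", zlib.crc32(tag + data) & 0xFFFFFFFF))

def write_mask(path, width, height):
    """Write an RGBA PNG mask: top half alpha=0 (inpaint), bottom half opaque."""
    rows = []
    for y in range(height):
        alpha = 0 if y < height // 2 else 255
        # Filter byte 0, then 4 bytes (RGBA) per pixel
        rows.append(b"\x00" + bytes((255, 255, 255, alpha)) * width)
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)  # 8-bit RGBA
    with open(path, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n"
                + _chunk(b"IHDR", ihdr)
                + _chunk(b"IDAT", zlib.compress(b"".join(rows)))
                + _chunk(b"IEND", b""))

write_mask("mask.png", 1024, 1024)
```

In practice, derive width and height from the source image, since the mask must match the original's size and format exactly.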
Multi-turn iteration: feed the previous output back as the next call’s image[] with a new instruction to incrementally refine. Each round is independently token-billed — watch cumulative cost.
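The iteration loop can be sketched as follows, assuming the OpenAI SDK from the examples above; `refine` is a hypothetical helper, and the real SDK may prefer a `("name.png", file)` tuple over a bare BytesIO for uploads:

```python
import base64
import io

def refine(client, image_bytes, prompts, model="gpt-image-2"):
    """Feed each round's output back as the next round's reference image."""
    for prompt in prompts:
        resp = client.images.edit(
            model=model,
            image=io.BytesIO(image_bytes),
            prompt=prompt,
        )
        # Each round is billed independently
        image_bytes = base64.b64decode(resp.data[0].b64_json)
    return image_bytes

# final = refine(client, open("photo.png", "rb").read(),
#                ["Replace the background with a seaside sunset",
#                 "Add warm rim lighting on the subject"])
```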
Response Example
The model returns one image per call:
{
  "created": 1776832476,
  "data": [
    {
      "b64_json": "iVBORw0KGgoAAAANSUhEUgAA..."
    }
  ],
  "usage": {
    "input_tokens": 1280,
    "output_tokens": 6240,
    "total_tokens": 7520
  }
}
b64_json is raw base64, without the data:image/...;base64, prefix — different from gpt-image-2-all. Decode it client-side to write a file, or prepend the prefix for browser rendering.
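The two consumption paths can be sketched as follows, using a stand-in base64 string in place of a real response:

```python
import base64

# Stand-in for resp.data[0].b64_json (raw base64, no data: prefix)
b64 = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode()

# File output: decode directly
png_bytes = base64.b64decode(b64)

# Browser rendering: prepend the data URL prefix yourself
data_url = "data:image/png;base64," + b64
```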
Edit requests’ input_tokens are typically significantly higher than text-to-image at the same size, because reference images are billed per Vision pricing rules. Multi-image fusion adds proportionally more input tokens per image.