
Overview

gpt-image-2 is OpenAI’s latest flagship image generation model and the successor to gpt-image-1.5. Core upgrades: any valid resolution (including 2K, and 4K up to 3840×2160), automatic high fidelity on reference images, and 20-30% lower cost at the same tier. APIYI’s gateway is fully compatible with the OpenAI Images API: point the official OpenAI SDK’s base_url here to connect with zero code changes.
🎨 Key highlights: Native support for any valid resolution (max 3840×2160 4K) + auto high-fidelity on reference image edits + 20-30% lower cost than 1.5 at same size and quality + native Chinese prompt support. Best for production scenarios that need precise size/quality control, must match the OpenAI official API exactly, or require 4K output.

Text-to-Image API

/v1/images/generations — generate images from text prompts with size / quality / output_format control.

Image Edit API

/v1/images/edits — multipart upload of reference images (up to 5) + edit/fusion instructions, with mask inpainting support.
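As a sketch of the two call shapes, the helpers below assemble the request payloads in plain Python (the helper names are ours, and files would be attached by your HTTP client; field names follow the OpenAI Images API as described on this page):

```python
# Sketch of the two request shapes for APIYI's OpenAI-compatible gateway.
# Helper names are illustrative, not part of any SDK.

BASE_URL = "https://api.apiyi.com/v1"

def generations_payload(prompt, size="1024x1024", quality="high",
                        output_format="png"):
    """JSON body for POST /v1/images/generations."""
    return {
        "model": "gpt-image-2",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "output_format": output_format,
        "n": 1,  # gpt-image-2 returns exactly one image per call
    }

def edits_form_fields(prompt, image_paths, mask_path=None):
    """Multipart field layout for POST /v1/images/edits."""
    if len(image_paths) > 5:
        raise ValueError("gpt-image-2 accepts at most 5 reference images")
    fields = {"model": "gpt-image-2", "prompt": prompt, "image[]": image_paths}
    if mask_path is not None:
        fields["mask"] = mask_path  # alpha-channel mask; applies to the first image
    return fields
```

The payload dicts map one-to-one onto the endpoint parameters documented below, so they can be passed directly to an HTTP client or compared against the official SDK’s serialized requests.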

Why Choose APIYI’s GPT-image-2 Official Relay?

Built on OpenAI’s official channel, deeply optimized for enterprise production workloads across reliability, cost, and integration experience:

Official Channel · Same as Official

Strictly routed through OpenAI’s official relay — requests and responses are 100% identical to OpenAI official: same fields, same error codes, same model behavior. Lossless quality, no silent rewrites.

No Concurrency Limits

Not bound by OpenAI’s Tier-based RPM / TPM ceilings. Enterprise-scale traffic scales linearly — batch generation and peak-load scenarios handled with ease.

Same Price + Up to 15% Off

Default unit price matches OpenAI’s official pricing. Stack with our top-up bonus events for up to 15% off — long-term cost drops noticeably.

Global Zero-Barrier Access

No overseas server or proxy required. Connect directly to api.apiyi.com from domestic data centers, home broadband, or overseas nodes — stable latency, no cross-border re-architecture.

Full Model Lineup

Seamlessly switch to the reverse-engineered gpt-image-2-all ($0.03/image flat), or the cost-leader Nano Banana Pro / 2 — mix and match per scenario.

Professional Enterprise Support

Our team specializes in production image-generation deployments, with deep experience in model selection, tuning, and integration — end-to-end support from PoC to production.

Core Features

Any Resolution (incl. 4K)

Supports any valid output size. Presets cover 1K / 2K / 3840×2160 4K. Custom sizes only need to satisfy basic constraints (edges as multiples of 16, ratio ≤ 3:1).

Auto High-Fidelity

Reference-image editing automatically enables high fidelity: detail, character identity, and text retention are dramatically improved. Do not pass input_fidelity (it will error).

20-30% Cheaper

1024×1024 high quality drops from roughly $0.25 on gpt-image-1.5 to $0.211 per image. 2K/4K output is token-metered but trends down similarly; long-term cost is noticeably lower.

Chinese + Text Rendering

Native Chinese prompt support, with stable rendering of Chinese and English text in signage, posters, and UI screenshots. Fine text is rarely blurry at quality=high.

Multi-Image Fusion (up to 5)

image[] array accepts up to 5 reference images. Use “image 1 / image 2 / image 3” in the prompt to reference them by upload order.
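A minimal sketch of this convention: fusion_prompt is a hypothetical helper (not part of any SDK) that prefixes the edit instruction with per-image roles in upload order.

```python
def fusion_prompt(instruction, roles):
    """Compose an edit prompt that references up to 5 uploads by order,
    using the 'image 1 / image 2 / ...' convention."""
    if len(roles) > 5:
        raise ValueError("gpt-image-2 accepts at most 5 reference images")
    lines = [f"image {i + 1}: {role}" for i, role in enumerate(roles)]
    return "\n".join(lines + [instruction])
```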

Mask Inpainting

Upload an alpha-channel mask. Transparent regions are inpaint areas, opaque regions are preserved.

Multiple Output Formats

Supports png (default) / jpeg / webp. Set output_compression for jpeg/webp to control file size.

OpenAI SDK Direct

Point base_url to https://api.apiyi.com/v1 and call directly with the official OpenAI SDK — zero-code migration.

Pricing

Token-metered (sum of input text + input image + output image tokens). Official per-image pricing reference:
| Quality | 1024×1024 | 1024×1536 | 1536×1024 |
|---|---|---|---|
| Low | $0.006 | $0.005 | $0.005 |
| Medium | $0.053 | $0.041 | $0.041 |
| High | $0.211 | $0.165 | $0.165 |
Pricing notes:
  • 2K / 4K has no fixed per-image price — billed by actual input + output tokens
  • Edit requests have noticeably higher input tokens than text-to-image due to forced high-fidelity
  • Streaming (stream: true + partial_images: N) costs an extra 100 output image tokens per partial
  • Compared to gpt-image-1.5 at the same size and quality, gpt-image-2 is about 20-30% cheaper

Technical Specifications

| Dimension | Value |
|---|---|
| Model name | gpt-image-2 |
| Speed | ~120 seconds (4K high quality approaches 2 min) |
| Output resolution | Any valid size (1K/2K/4K, max 3840×2160) |
| Quality tiers | auto / low / medium / high |
| Output formats | png (default) / jpeg / webp |
| Chinese prompts | ✅ Native |
| Images per call | 1 (n=1) |
| Reference image limit | 5 (image[]) |
| Mask inpainting | ✅ Supported (alpha channel required) |
| Transparent background | ❌ Not supported (background: transparent returns an error) |
| Response field | b64_json (raw base64, no prefix) |

Endpoints

| Endpoint | Purpose | Content-Type |
|---|---|---|
| POST /v1/images/generations | Text-to-image | application/json |
| POST /v1/images/edits | Reference editing / multi-image fusion / mask inpainting | multipart/form-data |
Domain selection: api.apiyi.com is the primary domain. Other gateway domains like b.apiyi.com / vip.apiyi.com work identically.

Size Reference

Preset Sizes

| size | Meaning | Pixels |
|---|---|---|
| auto | Adaptive (default) | Model decides |
| 1024x1024 | Square 1:1 | 1K |
| 1536x1024 | Landscape 3:2 | 1K |
| 1024x1536 | Portrait 2:3 | 1K |
| 2048x2048 | Square 1:1 | 2K |
| 2048x1152 | Landscape 16:9 | 2K |
| 3840x2160 | Landscape 16:9 | 4K |
| 2160x3840 | Portrait 9:16 | 4K |

Custom Size Constraints

gpt-image-2 accepts any valid size that satisfies all of:
  1. Max edge ≤ 3840px
  2. Both edges are multiples of 16
  3. Aspect ratio ≤ 3:1
  4. Total pixels ∈ [655,360, 8,294,400] (~0.65MP to ~8.3MP)
Valid examples: 1600x1200, 1792x1024, 2048x1536, 3200x1800
Invalid examples: 1000x1000 (not a multiple of 16), 4000x4000 (exceeds max edge), 3840x1000 (ratio > 3:1)
Outputs above 2560×1440 (~3.69MP) are officially marked experimental and may show quality fluctuations. For production, prefer presets like 2048x1152 / 2048x2048 / 3840x2160.

Best Practices

1. Prefer preset sizes

The 8 official presets are tuned for stable speed and quality. Reserve custom sizes for genuinely unusual aspect ratios.

2. Match quality to scenario

Drafts / batch → low; daily / final → medium; text, fine textures, print → high.

3. Choose JPEG output

For final display, output_format=jpeg + output_compression=85 is faster than PNG and roughly half the size.

4. Lock high for text scenarios

Text rendering is a key strength, but lower tiers can still blur. Lock quality=high for signage and poster scenarios.

5. Prepare reference images

Each image ≤ 10MB; PNG/JPEG/WebP supported; up to 5 images; reference them by upload order with “image 1 / image 2” in the prompt.

6. Timeout ≥ 360 seconds

quality=high + 2K/4K realistically takes several minutes. Configuring around the “~120s” headline figure causes many false timeouts. Set 360s as a conservative baseline, show progress in the UI, and consider a server-side task queue.

7. Migration notes

Migrating from gpt-image-1.5: drop input_fidelity (high fidelity is now forced; the field errors if passed) and avoid background: transparent (not supported).

Errors & Retries

| Status | Meaning | Suggested action |
|---|---|---|
| 400 | Invalid parameters (size constraint violation, unsupported field, etc.) | Validate against the size constraints; do not pass input_fidelity / background: transparent |
| 401 | Invalid token | Check Bearer Token |
| 403 | Content moderation block | Adjust prompt or pass moderation: low |
| 429 | Rate limit / insufficient balance | Exponential backoff |
| 5xx | Gateway / backend error | Retry 1–2 times |
| Timeout | Long-tail request | Client timeout ≥ 360 seconds (high + 2K/4K can run 3–5 minutes) |
Client recommendations:
  • Request timeout starts at 360 seconds (conservative baseline; quality=high + 2K/4K can take 3-5 minutes, and configuring around the ~120s figure causes many false timeouts)
  • Exponential backoff for 5xx and timeouts (suggest 2 retries)
  • Log x-request-id header for support
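The backoff recommendation can be sketched as a small wrapper. ApiError here is a hypothetical stand-in for whatever error type your HTTP client raises; only 429 and 5xx are retried, and 4xx client errors surface immediately:

```python
import random
import time

class ApiError(Exception):
    """Hypothetical stand-in for your HTTP client's error type."""
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status

RETRYABLE = {429, 500, 502, 503, 504}

def with_backoff(call, max_retries=2, base_delay=2.0):
    """Run `call`, retrying retryable statuses with exponential backoff + jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ApiError as e:
            if e.status not in RETRYABLE or attempt == max_retries:
                raise
            # 2s, 4s, 8s, ... plus jitter to avoid synchronized retries
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

Pair this with the ≥ 360-second request timeout so that retries wrap whole long-running generations rather than firing while one is still in flight.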

FAQ

Is the image returned as raw base64?

Yes. gpt-image-2 returns a raw base64 string (no data: prefix), unlike gpt-image-2-all. Two client patterns:
  • Write file: base64.b64decode(b64_str) → write to disk
  • Browser render: img.src = 'data:image/png;base64,' + b64_str (prepend manually)
If your code assumes the 1.5-era “already prefixed” behavior, you’ll get a corrupted data URL — handle this explicitly.
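Both client patterns, plus the defensive check for the 1.5-era prefixed behavior, can be sketched as (helper names are ours):

```python
import base64

def save_image(b64_str, path):
    """Write gpt-image-2's raw base64 payload (no data: prefix) to disk."""
    if b64_str.startswith("data:"):
        # Defensive: strip a data-URL prefix if an upstream layer added one.
        b64_str = b64_str.split(",", 1)[1]
    with open(path, "wb") as f:
        f.write(base64.b64decode(b64_str))

def to_data_url(b64_str, mime="image/png"):
    """Prepend the data-URL prefix a browser <img> tag expects."""
    return f"data:{mime};base64,{b64_str}"
```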
Where did input_fidelity go?

gpt-image-2 forces high-fidelity processing of reference images and no longer accepts input_fidelity. When migrating from 1.5, simply remove the field — no replacement is needed.
Can I get a transparent background?

gpt-image-2 does not support background: transparent (it returns an error). Two workarounds:
  • Set background to opaque (or omit) and key out transparency yourself with PIL / sharp / online tools
  • Temporarily fall back to gpt-image-1.5 for scenarios that genuinely need transparency
How many images per request?

1 image (n=1). For N images, issue N parallel requests; each is billed independently by tokens.
Why does generation take so long?

Higher resolution and higher quality require more output image tokens, which naturally takes longer. 3840×2160 + quality=high realistically approaches 2 minutes. Recommendations:
  • Client timeout ≥ 360 seconds (conservative)
  • Show “generating” progress in the UI
  • Use 1024×1024 / 1536×1024 1K presets when 4K isn’t needed
Why do edit requests cost more than text-to-image?

Because gpt-image-2 auto-enables high-fidelity processing of reference images, the references themselves convert into large input token counts under the Vision pricing rules. Edit input tokens are noticeably higher than text-to-image — budget accordingly.
What are the mask requirements?

  • Same size and format as the original, ≤ 50MB
  • Must have alpha channel: transparent (alpha=0) = inpaint area, opaque = preserve
  • Only applies to the first image
  • Mask is a “soft guide” — the model may extend or contract around the masked region
When should I pick gpt-image-2 vs gpt-image-2-all?

| Pick | When |
|---|---|
| gpt-image-2 (Official) | Need precise size/quality control, must match OpenAI official exactly, want 4K output, need mask inpainting |
| gpt-image-2-all (Reverse) | Want flat $0.03/image, ~30s render, minimal parameters, strong consistency / Chinese text |
Can I keep using the official OpenAI SDK?

Yes — zero code change. Point base_url to https://api.apiyi.com/v1 and set api_key to your APIYI token:
from openai import OpenAI
client = OpenAI(api_key="sk-your-key", base_url="https://api.apiyi.com/v1")
resp = client.images.generate(model="gpt-image-2", prompt="...", size="2048x1152", quality="high")
Can an in-flight request be cancelled?

No. gpt-image-2 uses OpenAI’s official synchronous endpoint — once a request is submitted, it runs to completion with no cancel signal. Even if the client disconnects, the server still finishes generation and bills normally. Configure client-side timeouts carefully, and do not assume that disconnecting avoids the charge.
What are the rate limits?

The default is 100 RPM (requests per minute). Actual usable RPM is also adjusted dynamically with overall platform load. If your workload needs more, contact us with your estimated QPS/RPM and we can provision additional capacity.
Is there an async or callback API?

No. gpt-image-2 strictly mirrors the OpenAI official API — synchronous only. The request blocks until the result is returned (high + 4K realistically takes 1–2 minutes). If you need an async queue or a callback mechanism:
  • Wrap it yourself with a task queue (Celery / BullMQ, etc.) at the business layer
  • Or use gpt-image-2-all — generates in ~30s, easier to poll from the front end
Am I charged for requests blocked by content moderation?

No. OpenAI’s built-in content moderation rejects unsafe or malformed requests with a 400 error, and no charge is incurred. Typical response:
{
  "status_code": 400,
  "error": {
    "message": "Your request was rejected by the safety system. ...",
    "type": "shell_api_error",
    "code": "moderation_blocked"
  }
}
Other zero-cost errors: 401 (invalid token), 429 (rate limit). Token billing only kicks in once the request actually reaches the model generation stage (i.e., 200 + b64_json received).
gpt-image-2 is OpenAI’s official flagship, billed by token. If you prioritize flat pricing ($0.03/image) and faster generation (~30s), see gpt-image-2-all.