
Overview

Beyond the standalone text-to-image / image-edit endpoints, APIYI also supports the OpenAI Responses API’s native image_generation tool: the main model gpt-5.5 decides on its own when to draw, internally selects a GPT Image model, and returns the image as base64 in the response output array.
Verified working (2026-06-17): gpt-5.5 + POST /v1/responses + tools: [{"type": "image_generation"}] returns a valid base64 PNG. Both image paths route directly through OpenAI’s official upstream.
Which should you use? For the vast majority of “I just want an image” cases, prefer the standalone /v1/images/generations endpoint: it bills purely by actual usage, which is cheaper and more controllable. Use the native tool method on this page only when your pipeline must go through Responses (e.g. letting gpt-5.5 decide autonomously whether to draw inside an Agent conversation), because the tool method adds a fixed tool-call fee of roughly $0.20 per image.

Comparison of the two methods

Aspect | Native tool method (this page) | images API
Channel | Direct OpenAI official forwarding | Direct OpenAI official forwarding
Endpoint | /v1/responses | /v1/images/generations, /v1/images/edits
Tool | image_generation | None (pass prompt directly)
Billing | Usage-based + tool-call fee | Usage-based
Billing detail | Text/image input-output priced same as official; fixed tool-call fee ≈ $0.20/call | Text/image input-output priced same as official
Best for | Cases that require Responses (e.g. Agent autonomy) | Most image scenarios (more reasonable billing)
Core difference: the native tool method adds a fixed ≈$0.20 tool fee per image, while the images API bills purely by actual usage, so the images API is cheaper in most cases.

Minimal request

cURL

curl https://api.apiyi.com/v1/responses \
  -H "Authorization: Bearer $APIYI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": "Generate an image of a gray tabby cat hugging an otter with an orange scarf",
    "tools": [
      { "type": "image_generation" }
    ]
  }'

Python (requests)

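import base64
import os

import requests

resp = requests.post(
    "https://api.apiyi.com/v1/responses",
    headers={
        # Read the key from the environment; Python does not expand "$APIYI_KEY" inside a string
        "Authorization": f"Bearer {os.environ['APIYI_KEY']}",
        "Content-Type": "application/json",
    },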
    json={
        "model": "gpt-5.5",
        "input": "Generate an image of a gray tabby cat hugging an otter with an orange scarf",
        "tools": [{"type": "image_generation"}],
    },
    timeout=300,           # Generation is slow; allow plenty of timeout (~60-90s per image)
)
data = resp.json()

# Pull the image tool result out of the output array
for item in data["output"]:
    if item.get("type") == "image_generation_call":
        raw = base64.b64decode(item["result"])   # result field is a base64 image
        with open("output.png", "wb") as f:
            f.write(raw)
        print("Saved output.png,", len(raw), "bytes")
Optional parameters go inside the tools item: {"type": "image_generation", "output_format": "png|jpeg|webp", "size": "1024x1024", ...}. Omit them to use the defaults (png).
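As a rough sketch of how those options might be attached, the helper below builds the request body with the optional fields placed inside the tools item. The name build_image_request is hypothetical; only output_format and size are taken from this page, and the accepted values should be checked against the upstream documentation.

```python
def build_image_request(prompt, model="gpt-5.5", output_format=None, size=None):
    """Build a /v1/responses body (hypothetical helper, for illustration only)."""
    tool = {"type": "image_generation"}
    # Include only the options the caller set, so omitted ones use the defaults.
    if output_format is not None:
        tool["output_format"] = output_format
    if size is not None:
        tool["size"] = size
    return {"model": model, "input": prompt, "tools": [tool]}


payload = build_image_request("A gray tabby cat", output_format="webp", size="1024x1024")
```

The resulting payload can be passed as the json= argument of the requests.post call shown earlier.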

Response structure (key fields)

On success (HTTP 200), the response body contains:
{
  "id": "resp_...",
  "model": "gpt-5.5-2026-04-23",
  "status": "completed",
  "output": [
    {
      "type": "image_generation_call",   // <- key: the tool actually fired
      "result": "<a very long base64 PNG string>"  // <- the image itself, base64, png by default
    },
    { "type": "message", "content": [ /* may be empty; image responses don't always include text */ ] }
  ],
  "usage": { "input_tokens": 2347, "output_tokens": 74 }
}
How to tell whether an image was actually produced:
  • Success: output contains type="image_generation_call", and result decodes to a valid image starting with \x89PNG.
  • ⚠️ Silently stripped: HTTP 200 but no image_generation_call in output, only text (common when a channel doesn’t support the tool).
  • Error: non-200, or the response reports unknown tool / no available channels, etc.
In the silently stripped and error cases, fall back to /v1/images/generations.
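The three outcomes above could be told apart with a small check like the following. This is a minimal sketch: classify_response is a hypothetical helper, and it assumes the default PNG output, so a request that sets output_format to jpeg or webp would need a different signature check.

```python
import base64

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def classify_response(status_code, body):
    """Return (state, image_bytes_or_None); state is 'success', 'stripped' or 'error'."""
    if status_code != 200:
        return "error", None
    for item in body.get("output", []):
        if item.get("type") == "image_generation_call" and item.get("result"):
            raw = base64.b64decode(item["result"])
            # Assumption: default output format, so the bytes should be a PNG.
            if raw.startswith(PNG_MAGIC):
                return "success", raw
            return "error", None
    # HTTP 200 without an image_generation_call item: the tool was silently stripped.
    return "stripped", None
```

A caller would pass resp.status_code and the parsed JSON body, then fall back to /v1/images/generations whenever the state is not "success". Note that a non-200 response may not carry a JSON body, so parsing should be guarded in real code.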

💰 Billing

Take one real call as an example (input 2347 tokens, output 74 tokens, generating one 1122×1402 PNG). The final charge was $0.213954, which matches the following breakdown:
Component | Quota calculation | USD
Text portion | (input 2347 + output 74 × completion multiplier 6) × input multiplier 2.5 = 6977.5 quota | ≈ $0.014
Image tool portion | ≈ 100,000 quota (per image, independent of tokens) | ≈ $0.20 / image
Total | 106,977 quota | $0.213954
Conversion: 500,000 quota = $1 (derived from 106977 quota = $0.213954).
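The arithmetic in this example can be reproduced with a few lines of Python. This is only a sketch: the multipliers are the ones quoted for this one call, the 100,000-quota tool fee is approximate, and dropping the fractional quota to arrive at 106,977 is an assumption inferred from the total shown here, not a documented rounding rule.

```python
QUOTA_PER_USD = 500_000      # from this page: 500,000 quota = $1
COMPLETION_MULTIPLIER = 6    # as quoted in the worked example
INPUT_MULTIPLIER = 2.5       # as quoted in the worked example
IMAGE_TOOL_QUOTA = 100_000   # approximate fixed fee per image


def estimate_quota(input_tokens, output_tokens, images=1):
    """Return (text_quota, total_quota) for one Responses call with image tool use."""
    text = (input_tokens + output_tokens * COMPLETION_MULTIPLIER) * INPUT_MULTIPLIER
    return text, text + images * IMAGE_TOOL_QUOTA


text_quota, total_quota = estimate_quota(2347, 74)
billed_quota = int(total_quota)  # assumption: the fractional quota is dropped
print(text_quota, billed_quota, billed_quota / QUOTA_PER_USD)
```

The text portion comes to 6977.5 quota, and the tool fee dominates the total, which is why prompt length has little effect on the final charge.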
A display quirk in the console detail page (explain this to customers proactively)
On APIYI’s “conditional billing detail” page:
  • The top section only shows the text portion of the math (base cost = (2347 + 74×6) × 2.5 = 6977.50);
  • The image tool-call charge (≈100,000 quota / ≈$0.20) shows up as a blank row in the detail list — it isn’t rendered;
  • but it is correctly counted in the bottom-line “final quota 106977 / $0.213954”.
Conclusion: billing is normal and accurate — the detail UI simply fails to display the “image tool” row, so the line items don’t sum to the final total. When explaining to customers, emphasize: the total is correct; the difference is this image’s tool fee (≈$0.20/image), just not itemized separately.

Cost notes

  • The generation fee is fixed per image (≈$0.20/image) and does not vary with prompt length; the text token cost is small by comparison.
  • Each image takes ~60-90s; set a client timeout of ≥300s.
  • If you only need an image and don’t need the model to decide autonomously, the standalone /v1/images/generations endpoint is likely cheaper and more controllable.

Troubleshooting

Symptom | Likely cause | Fix
200 but no image_generation_call | Current channel doesn’t support the tool (silently stripped) | Switch key/channel, or use /v1/images/generations
no available channels | No matching channel under the key’s group | Switch to a key group with GPT/image channels
Request timeout | Generation is slow | Set client timeout to 300s
result doesn’t decode to PNG | Output format changed / channel anomaly | Check output_format, verify the magic bytes
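For the magic-bytes check in the last row, a minimal sketch is shown below. detect_image_format is a hypothetical helper; it covers only the three formats this page lists for output_format (png, jpeg, webp) and inspects the standard file signatures rather than fully validating the image.

```python
def detect_image_format(raw):
    """Guess the image format from its leading bytes; return None if unrecognized."""
    if raw.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if raw.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    # WebP is a RIFF container: "RIFF", a 4-byte size field, then "WEBP".
    if raw[:4] == b"RIFF" and raw[8:12] == b"WEBP":
        return "webp"
    return None
```

If the detected format differs from the requested output_format, or the result is None, the decoded data is probably not a usable image and the request is worth retrying on another channel.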