TL;DR
APIYI fully supports OpenAI’s official web search: use the Responses API (/v1/responses) with the web_search tool. Both gpt-5.5 and gpt-5.4 were verified to genuinely search the web and return up-to-date information with source citations. A default-group key works out of the box — no special activation needed.
Real-world availability (test data, 2026-06-11)
| Model | Web result | Citations | Searches per Q&A | Latency |
|---|---|---|---|---|
| gpt-5.4 | ✅ Accurate same-week news | ✅ Structured url_citation | 1 | ~11s |
| gpt-5.5 | ✅ Accurate same-week news (auto time-window scoping, multi-source cross-checking) | ✅ Structured url_citation | ~8 | ~51s |
Quick start
cURL
Python (OpenAI SDK)
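A minimal sketch of the call described in the TL;DR, using the official `openai` Python SDK against the Responses API with the `web_search` tool. The base URL `https://api.apiyi.com/v1` and the `APIYI_API_KEY` environment variable name are assumptions for illustration; substitute the values shown in your APIYI console. The helper names `build_request` and `ask` are hypothetical.

```python
import os

# Assumed values for illustration: confirm both in your APIYI console.
BASE_URL = "https://api.apiyi.com/v1"
API_KEY_ENV = "APIYI_API_KEY"


def build_request(question: str, model: str = "gpt-5.4") -> dict:
    """Build keyword arguments for a Responses API call with web search enabled."""
    return {
        "model": model,
        "input": question,
        "tools": [{"type": "web_search"}],
        # A small limit can end in status "incomplete" with no final answer.
        "max_output_tokens": 8192,
    }


def ask(question: str, model: str = "gpt-5.4"):
    """Send the request through the OpenAI SDK (requires `pip install openai`)."""
    # Imported lazily so build_request can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(api_key=os.environ[API_KEY_ENV], base_url=BASE_URL)
    return client.responses.create(**build_request(question, model))
```

Calling `ask("What were the top technology headlines this week?")` returns the SDK response object; its `status` field reports completion and `output_text` holds the final answer text.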
Response structure
The output array contains, in execution order:
| item type | Meaning |
|---|---|
| web_search_call | One actually executed search (billing counts these entries) |
| reasoning | The model’s reasoning process (gpt-5 series) |
| message | The final answer; its content[].annotations contains url_citation (title + url) |
status: "completed" means it finished normally; incomplete usually means max_output_tokens was too small — increase it.
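The structure above can be walked with a few lines of Python. This sketch assumes the response is available as a plain dict (parsed JSON, or `response.model_dump()` from the SDK object); the function name `summarize_output` and the sample data are hypothetical, written by hand to match the item types listed in the table.

```python
def summarize_output(response: dict) -> dict:
    """Count executed searches and collect url_citation entries from a response."""
    search_calls = 0
    citations = []
    for item in response.get("output", []):
        item_type = item.get("type")
        if item_type == "web_search_call":
            search_calls += 1
        elif item_type == "message":
            for part in item.get("content", []):
                for note in part.get("annotations", []):
                    if note.get("type") == "url_citation":
                        citations.append((note.get("title"), note.get("url")))
    return {
        "status": response.get("status"),
        "search_calls": search_calls,
        "citations": citations,
    }


# Hand-written sample in the documented shape (not real API output).
sample = {
    "status": "completed",
    "output": [
        {"type": "web_search_call", "status": "completed"},
        {"type": "reasoning"},
        {
            "type": "message",
            "content": [
                {
                    "type": "output_text",
                    "text": "Example answer.",
                    "annotations": [
                        {
                            "type": "url_citation",
                            "title": "Example source",
                            "url": "https://example.com/article",
                        }
                    ],
                }
            ],
        },
    ],
}

summary = summarize_output(sample)
```

The `search_calls` count is the number that matters for the tool-call fee, and an empty `citations` list alongside zero search calls suggests the model answered without searching.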
Billing (important)
Web search cost is made up of two parts:
| Item | Price | Notes |
|---|---|---|
| Tool-call fee | $10 / 1,000 calls ($0.01 per call) | Tool name: web_search; counted by the number of web_search_call entries in the response output — one question may trigger multiple searches (gpt-5.4 usually 1, gpt-5.5 usually 5–8) |
| Retrieved-content token fee | Standard model input price | Search results are injected into the model context and billed as input tokens. This is usually the larger share: measured at roughly 9k input tokens per Q&A for gpt-5.4, and 48–54k for gpt-5.5 |
Measured total cost per web-enabled Q&A: gpt-5.4 ≈ $0.01 search fee + 9k tokens; gpt-5.5 ≈ $0.08 search fee + ~50k tokens. Estimate against your expected query volume.
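To turn the figures above into a budget, the two parts can be added per query and multiplied by volume. The sketch below uses the $0.01 per-call fee from the table; the input-token price is a placeholder (`EXAMPLE_INPUT_PRICE` is hypothetical, not a real rate), so look up the actual price for your model. Output and reasoning tokens are not included.

```python
SEARCH_FEE_PER_CALL = 10 / 1000  # $10 per 1,000 web_search calls

# Hypothetical placeholder in USD per 1M input tokens: replace with the real price.
EXAMPLE_INPUT_PRICE = 2.0


def estimate_cost(search_calls: int, input_tokens: int,
                  input_price_per_million: float, queries: int = 1) -> float:
    """Estimate search fee plus injected-input token cost, in USD."""
    tool_fee = search_calls * SEARCH_FEE_PER_CALL
    token_fee = input_tokens / 1_000_000 * input_price_per_million
    return queries * (tool_fee + token_fee)


# Per-query figures taken from the measurements above.
gpt_5_4_per_query = estimate_cost(1, 9_000, EXAMPLE_INPUT_PRICE)
gpt_5_5_per_query = estimate_cost(8, 50_000, EXAMPLE_INPUT_PRICE)
```

Passing `queries=1000` gives a rough figure for a thousand web-enabled questions at the same search count and token volume.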
Notes
- Use the Responses API, not Chat Completions’ web_search_options: gpt-5 series models do not support that parameter (official OpenAI behavior; the request returns 400 Unknown parameter: 'web_search_options'). web_search_options only applies to the dedicated *-search-preview models.
- Set max_output_tokens to at least 8192: gpt-5.5 consumes many reasoning tokens; a small limit returns status: "incomplete" with no final answer, while tokens are still billed.
- The legacy tool type web_search_preview also works with identical behavior; for new integrations, use web_search directly.
- To control cost, constrain search behavior in the prompt (e.g. “search at most 2 times”) or use gpt-5.4.
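One way to handle the incomplete case from the notes above is to resend once with a larger limit. This is a sketch under stated assumptions, not a recommended default: `send` is a hypothetical callable that posts the request to /v1/responses and returns the parsed JSON as a dict, and the doubling strategy and `token_cap` value are illustrative choices. Every attempt is billed, so the number of attempts is kept small.

```python
from typing import Callable


def ask_with_budget(send: Callable[[dict], dict], request: dict,
                    max_attempts: int = 2, token_cap: int = 32768) -> dict:
    """Resend with a larger max_output_tokens while the response is incomplete."""
    attempt_request = dict(request)  # shallow copy so the caller's dict is untouched
    response = send(attempt_request)
    for _ in range(max_attempts - 1):
        if response.get("status") != "incomplete":
            break
        current = attempt_request.get("max_output_tokens", 8192)
        if current >= token_cap:
            break  # already at the cap; retrying would only add cost
        attempt_request["max_output_tokens"] = min(current * 2, token_cap)
        response = send(attempt_request)
    return response
```

With the SDK, `send` could be a small wrapper such as `lambda req: client.responses.create(**req).model_dump()`, where `client` is an already configured `OpenAI` instance.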
FAQ
Q: How do I confirm the answer really used the web?
A: Check whether the response output contains entries with type="web_search_call" and whether the message annotations include url_citation. Both present means real web access; answer text alone without these two markers means the model answered from training data.
Q: Do I need a different group or a special key?
A: No. For OpenAI models, a default-group key can call web search directly.
Q: Which models are supported?
A: gpt-5.5 and gpt-5.4 are verified. Other gpt-5 series models should also support the Responses API web_search tool in principle — run the verification check from the FAQ above before relying on it.
Related Docs
OpenAI Native Calls (Responses API)
Responses API endpoint, parameters, and setup
OpenAI Prompt Caching
The large input-token volume injected by web search pairs well with caching