Next-generation AI agent API - combining Chat Completions simplicity with powerful tool calling capabilities
APIYI fully supports OpenAI’s latest Responses API, the next-generation AI agent building interface launched in March 2025. The Responses API combines the simplicity of Chat Completions with the tool usage and state management capabilities of the Assistants API, providing developers with a more flexible and powerful AI application building experience.
Next-Generation API: The Responses API is a superset of Chat Completions: it provides everything Chat Completions does while adding advanced features such as built-in tools and state management. Note, however, that it is only available for a few newer OpenAI models; see below for details.
```bash
curl https://api.apiyi.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4.1",
    "input": "Hello! How can you help me today?",
    "instructions": "You are a helpful assistant."
  }'
```
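The same request can be issued from Python with just the standard library. The sketch below mirrors the curl call above without actually sending it (the `API_KEY` value is a placeholder):

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder - substitute a real APIYI key

# Build the same JSON body the curl example sends
body = json.dumps({
    "model": "gpt-4.1",
    "input": "Hello! How can you help me today?",
    "instructions": "You are a helpful assistant.",
}).encode()

req = urllib.request.Request(
    "https://api.apiyi.com/v1/responses",
    data=body,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

# urllib.request.urlopen(req) would send the request; it is not called
# here because it requires a valid key.
```

In practice you would more likely use the official `openai` SDK with `base_url="https://api.apiyi.com/v1"`, as the later examples on this page do.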
```python
from openai import OpenAI

# Point the official SDK at the APIYI endpoint
client = OpenAI(base_url="https://api.apiyi.com/v1", api_key="YOUR_API_KEY")

response = client.responses.create(
    model="gpt-4.1",
    input="Create a chart showing sales data: Jan:100, Feb:150, Mar:120",
    instructions="You are a data analyst. Use code interpreter to create visualizations.",
    # code_interpreter runs in a container; "auto" lets the API manage it
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
)
```
```python
response = client.responses.create(
    model="gpt-4.1",
    input="Search for information about quarterly reports",
    instructions="You are a document analyst.",
    # file_search needs the vector store(s) to search; replace with your own ID
    tools=[{"type": "file_search", "vector_store_ids": ["YOUR_VECTOR_STORE_ID"]}],
)
```
```python
# First conversation turn
response1 = client.responses.create(
    model="gpt-4.1",
    input="My name is Alice. Please remember this.",
    instructions="You are a helpful assistant with good memory.",
)

# Second conversation turn - use previous_response_id to maintain context
response2 = client.responses.create(
    model="gpt-4.1",
    input="What's my name?",
    instructions="You are a helpful assistant with good memory.",
    previous_response_id=response1.id,
)

print(response2.output[0].content[0].text)  # Should answer "Alice"
```
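The `output[0].content[0].text` access above assumes the response holds exactly one message with one text segment, but `output` is a list that can also carry tool-call items. As an illustrative sketch (this helper is not part of the SDK), a more defensive extraction over a raw response dict could walk the structure:

```python
def extract_text(response: dict) -> str:
    """Concatenate all output_text segments from a raw Responses API dict."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") == "message":
            for segment in item.get("content", []):
                if segment.get("type") == "output_text":
                    parts.append(segment.get("text", ""))
    return "".join(parts)

# Example shaped like a minimal /v1/responses payload:
raw = {
    "output": [
        {
            "type": "message",
            "content": [{"type": "output_text", "text": "Hello, Alice!"}],
        }
    ]
}
print(extract_text(raw))  # Hello, Alice!
```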
```python
def multi_turn_conversation():
    response_id = None
    for user_input in ["What's 2+2?", "Now multiply that by 3", "And divide by 2"]:
        response = client.responses.create(
            model="o3",
            input=user_input,
            instructions="You are a math tutor. Show your reasoning.",
            previous_response_id=response_id,
            tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
        )
        print(f"User: {user_input}")
        print(f"Assistant: {response.output[0].content[0].text}")
        response_id = response.id  # Maintain context
```
Reasoning models have special advantages in the Responses API:
```python
# Use o3 for complex reasoning
response = client.responses.create(
    model="o3",
    input="Solve this step by step: If a train travels 120km in 2 hours, then speeds up 20% for the next hour, how far did it travel in total?",
    instructions="Think through this problem step by step, showing all reasoning.",
)

# View reasoning token usage
reasoning_tokens = response.usage.output_tokens_details.reasoning_tokens
print(f"Reasoning tokens used: {reasoning_tokens}")

# Continue the conversation - reasoning context persists across turns
follow_up = client.responses.create(
    model="o3",
    input="Now what if the train slowed down 10% in the fourth hour?",
    previous_response_id=response.id,
)
```
```json
{
  "error": {
    "type": "invalid_request_error",
    "code": "model_not_supported",
    "message": "The model 'gpt-3.5-turbo' is not supported for the responses endpoint.",
    "param": "model"
  }
}
```
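Since older models such as gpt-3.5-turbo are rejected with `model_not_supported`, an application can route them to Chat Completions up front instead of handling the error after the fact. In the sketch below, the supported-model set is an illustrative assumption, not an exhaustive list; check the current model list rather than relying on it:

```python
# Models assumed here to be accepted by /v1/responses (illustrative only)
SUPPORTED_RESPONSES_MODELS = {"gpt-4.1", "gpt-4o", "o3", "o4-mini"}

def pick_endpoint(model: str) -> str:
    """Route models the Responses endpoint rejects to Chat Completions."""
    if model in SUPPORTED_RESPONSES_MODELS:
        return "/v1/responses"
    return "/v1/chat/completions"

print(pick_endpoint("gpt-3.5-turbo"))  # /v1/chat/completions
```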
```python
def smart_tool_calling(user_input):
    # Select tools based on keywords in the input
    # (weather_tool, calculator_tool, and search_tool are function-tool
    # definitions assumed to be declared elsewhere)
    available_tools = []
    if "weather" in user_input.lower():
        available_tools.append(weather_tool)
    if "calculate" in user_input.lower():
        available_tools.append(calculator_tool)
    if "search" in user_input.lower():
        available_tools.append(search_tool)

    response = client.responses.create(
        model="gpt-4.1",
        input=user_input,
        instructions="Use the appropriate tools to help the user.",
        tools=available_tools,
        tool_choice="auto",
    )
    return response
```
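The tools referenced above (`weather_tool`, etc.) are assumed to be defined elsewhere. As a sketch, a custom function tool for the Responses API can be declared as below; unlike Chat Completions, the Responses API takes the function fields (`name`, `parameters`) at the top level of the tool object. The `get_weather` name and its schema are hypothetical:

```python
# Hypothetical function-tool definition in Responses API format
weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name"},
        },
        "required": ["city"],
    },
}
```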
Development Recommendation: New projects should use the Responses API directly; existing projects can migrate gradually. APIYI will continue to track OpenAI updates to ensure feature completeness.