OpenAI Chat Completions Compatible Mode

/v1/chat/completions is the de facto standard interface of the LLM industry — virtually every framework, client, and SDK supports it out of the box. Through APIYI, this single endpoint reaches OpenAI, Claude, Gemini, DeepSeek, and 400+ models in total; switching models is just swapping a string.

Which endpoint to pick: if you are using existing frameworks or clients, or want one codebase across multiple vendors, use compatible mode (this page). If you need built-in tools (web search, code interpreter) or Pro-series models, use Native Calls (/v1/responses). OpenAI’s official stance is that Chat Completions is supported long-term, but Responses is recommended for new projects. Both endpoints require you to maintain conversation history yourself; see the Multi-Turn Conversation Guide.

Quick Start

curl https://api.apiyi.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "user", "content": "Introduce yourself in one sentence"}
    ]
  }'

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"
)

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Introduce yourself in one sentence"}]
)

print(response.choices[0].message.content)

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://api.apiyi.com/v1'
});

const response = await openai.chat.completions.create({
  model: 'gpt-5.4',
  messages: [{ role: 'user', content: 'Introduce yourself in one sentence' }]
});

console.log(response.choices[0].message.content);

One Interface, Every Provider

This is the biggest payoff of compatible mode: switching models means changing the model string, while the calling code stays the same.

def ask(message: str, model: str) -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": message}]
    )
    return response.choices[0].message.content

print(ask("Explain quantum entanglement", "gpt-5.4"))               # OpenAI
print(ask("Explain quantum entanglement", "claude-sonnet-4-6"))      # Anthropic
print(ask("Explain quantum entanglement", "gemini-3-pro-preview"))   # Google
print(ask("Explain quantum entanglement", "deepseek-chat"))          # DeepSeek

Full model names and prices: Models & Pricing. Note: calling Claude through the compatible format forfeits Claude’s Prompt Cache discount — for heavy Claude usage, use Claude Native Calls.
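
Because every provider sits behind the same call, a fallback chain across vendors is only a loop over model names. The sketch below is one possible shape, not an APIYI feature: `ask_with_fallback` is a hypothetical helper, and it catches every exception for brevity, which you would narrow in production.

```python
from typing import Optional, Sequence, Tuple


def ask_with_fallback(client, message: str, models: Sequence[str]) -> Tuple[str, str]:
    """Try each model in order and return (model_used, reply_text).

    `client` is any object exposing the OpenAI SDK's
    `chat.completions.create(...)` method, such as the `client` defined above.
    """
    last_error: Optional[Exception] = None
    for model in models:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": message}],
            )
            return model, response.choices[0].message.content
        except Exception as exc:  # deliberately broad for a sketch
            last_error = exc  # remember the failure and move on to the next model
    raise RuntimeError(f"all models failed: {list(models)}") from last_error
```

A call such as `ask_with_fallback(client, "Explain quantum entanglement", ["gpt-5.4", "claude-sonnet-4-6"])` returns the first model that answered together with its reply.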

SDK Setup per Language

Every official SDK supports a custom base_url — configure once and go.

Python

pip install openai

from openai import OpenAI, AsyncOpenAI

# Synchronous client
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"
)

# Async client
async_client = AsyncOpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1"
)

Or use environment variables for zero in-code config:

export OPENAI_API_KEY="YOUR_API_KEY"
export OPENAI_BASE_URL="https://api.apiyi.com/v1"

from openai import OpenAI
client = OpenAI()  # reads the environment automatically
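
The async client is mainly useful for sending several requests at once. The following is a minimal sketch, assuming an `AsyncOpenAI` instance like `async_client` above; `ask_many` and its `limit` parameter are hypothetical names, and the concurrency cap of 5 is an arbitrary default rather than a documented APIYI limit.

```python
import asyncio
from typing import List, Sequence


async def ask_many(async_client, prompts: Sequence[str], model: str, limit: int = 5) -> List[str]:
    """Send several prompts concurrently, with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def ask_one(prompt: str) -> str:
        async with semaphore:  # wait here while `limit` requests are already running
            response = await async_client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content

    # gather preserves input order, so replies[i] answers prompts[i]
    return await asyncio.gather(*(ask_one(p) for p in prompts))
```

From synchronous code, run it with `asyncio.run(ask_many(async_client, ["Hello!", "Bonjour!"], "gpt-5.4"))`.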

Node.js / TypeScript

npm install openai

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: 'https://api.apiyi.com/v1'
});

const response = await openai.chat.completions.create({
  model: 'gpt-5.4-mini',
  messages: [{ role: 'user', content: 'Hello!' }]
});

.NET

dotnet add package OpenAI

using OpenAI;
using OpenAI.Chat;

var client = new OpenAIClient(
    new System.ClientModel.ApiKeyCredential("YOUR_API_KEY"),
    new OpenAIClientOptions { Endpoint = new Uri("https://api.apiyi.com/v1") }
);

var chatClient = client.GetChatClient("gpt-5.4");
var response = await chatClient.CompleteChatAsync("Hello!");
Console.WriteLine(response.Value.Content[0].Text);

Go

Use the official OpenAI Go SDK (github.com/openai/openai-go):

go get github.com/openai/openai-go

package main

import (
    "context"
    "fmt"

    "github.com/openai/openai-go"
    "github.com/openai/openai-go/option"
)

func main() {
    client := openai.NewClient(
        option.WithAPIKey("YOUR_API_KEY"),
        option.WithBaseURL("https://api.apiyi.com/v1"),
    )

    completion, err := client.Chat.Completions.New(context.TODO(), openai.ChatCompletionNewParams{
        Model: "gpt-5.4",
        Messages: []openai.ChatCompletionMessageParamUnion{
            openai.UserMessage("Hello!"),
        },
    })
    if err != nil {
        panic(err)
    }
    fmt.Println(completion.Choices[0].Message.Content)
}

Java

Use the official OpenAI Java SDK (com.openai:openai-java):

<dependency>
    <groupId>com.openai</groupId>
    <artifactId>openai-java</artifactId>
    <version>LATEST</version>
</dependency>

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.chat.completions.ChatCompletion;
import com.openai.models.chat.completions.ChatCompletionCreateParams;

OpenAIClient client = OpenAIOkHttpClient.builder()
    .apiKey("YOUR_API_KEY")
    .baseUrl("https://api.apiyi.com/v1")
    .build();

ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
    .model("gpt-5.4")
    .addUserMessage("Hello!")
    .build();

ChatCompletion completion = client.chat().completions().create(params);
System.out.println(completion.choices().get(0).message().content().orElse(""));

Legacy projects on third-party libraries (Go’s sashabaranov/go-openai, Java’s theokanning packages) still work after changing the base_url, but we recommend migrating to the official SDKs above — third-party libraries lag on new parameters such as reasoning_effort.

Common Features

Streaming

stream = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Write a short poem about autumn"}],
    stream=True
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
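
Most applications also need the complete reply once streaming ends, for example to append it to the conversation history. A small sketch of that pattern follows; `collect_stream` is a hypothetical helper that accepts the `stream` object from the call above.

```python
from typing import Iterable


def collect_stream(stream: Iterable) -> str:
    """Print deltas as they arrive and return the full reply text."""
    parts = []
    for chunk in stream:
        # Some chunks may carry no choices or an empty delta; skip them.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
            parts.append(delta)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)
```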

Reasoning control

On Chat Completions, use the top-level reasoning_effort parameter (different from the nested form on Responses):

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational"}],
    reasoning_effort="high"  # none / low / medium / high / xhigh
)

On GPT-5.4 and later (including the gpt-5.6 series), tools and reasoning_effort are mutually exclusive on this endpoint: carrying tools while reasoning_effort is not none fails with a 400 — Function tools with reasoning_effort are not supported for ... in /v1/chat/completions. Omitting the parameter doesn’t help, since it defaults to medium. This is an official OpenAI restriction — for reasoning plus tool calling, switch to the Responses endpoint, or explicitly set reasoning_effort="none".

gpt-5 series reasoning models do not support temperature / top_p on this endpoint either — passing them raises an error.
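
If request parameters are assembled in one place, both restrictions above can be enforced before the request leaves your code. The helper below is a sketch under stated assumptions: `normalize_params` is a hypothetical function, and the two prefix tuples are illustrative guesses at which model names are affected, not an official list.

```python
from typing import Any, Dict

# Assumption: illustrative name prefixes only; adjust them to the models you call.
SAMPLING_BLOCKED_PREFIXES = ("gpt-5",)
TOOLS_CONFLICT_PREFIXES = ("gpt-5.4", "gpt-5.6")


def normalize_params(params: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `params` adjusted to the restrictions described above."""
    out = dict(params)  # shallow copy, so the caller's dict is left untouched
    model = str(out.get("model", ""))

    # Sampling parameters are rejected by these reasoning models.
    if model.startswith(SAMPLING_BLOCKED_PREFIXES):
        out.pop("temperature", None)
        out.pop("top_p", None)

    # With tools present, reasoning must be switched off explicitly,
    # because an omitted reasoning_effort defaults to medium.
    if out.get("tools") and model.startswith(TOOLS_CONFLICT_PREFIXES):
        effort = out.get("reasoning_effort")
        if effort not in (None, "none"):
            raise ValueError(
                "tools cannot be combined with reasoning_effort on "
                "/v1/chat/completions; use the Responses endpoint instead"
            )
        out["reasoning_effort"] = "none"
    return out
```

Usage would look like `client.chat.completions.create(**normalize_params(params))`. Raising on an explicit conflict, rather than silently lowering the effort, keeps the caller's intent visible.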

Image input

response = client.chat.completions.create(
    model="gpt-5.4",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ]
)
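
For a local file rather than a public URL, the image can be embedded as a base64 data URL in the same `image_url` field. OpenAI's Chat Completions format accepts this; whether every model reachable through this endpoint does is an assumption to verify per model. `image_part_from_file` is a hypothetical helper name.

```python
import base64
import mimetypes
from pathlib import Path


def image_part_from_file(path: str) -> dict:
    """Build an `image_url` content part from a local file, encoded as a data URL."""
    mime, _ = mimetypes.guess_type(path)  # inferred from the file extension
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognised image type: {path}")
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}}
```

The returned dict replaces the `image_url` entry in the `content` list shown above.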

Embeddings

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="Text to embed"
)
embedding = response.data[0].embedding
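
Embeddings are usually compared with cosine similarity. A dependency-free sketch is shown below; `cosine_similarity` is a helper written for this example, not part of the SDK, and for large collections a vectorised library would be the more practical choice.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

To compare two texts, pass a list as `input` to `client.embeddings.create` and feed `response.data[0].embedding` and `response.data[1].embedding` to this function.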

Error Handling and Retries

The official SDKs retry automatically with exponential backoff (the Python SDK defaults to 2 retries, covering connection errors and 408 / 409 / 429 / 5xx responses). Prefer that over hand-rolled loops:

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.apiyi.com/v1",
    max_retries=3,   # built-in exponential backoff
    timeout=60.0
)

For finer control, catch by exception type:

from openai import (
    APIError,
    APIConnectionError,
    RateLimitError,
    InternalServerError,
)

try:
    response = client.chat.completions.create(
        model="gpt-5.4",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError:
    print("Rate limited — retry later")
except APIConnectionError:
    print("Connection error — check network/proxy")
except InternalServerError:
    print("Upstream error — worth retrying")
except APIError as e:
    print(f"API error: {e}")

Capability Boundaries of Compatible Mode

Capability	Compatible mode	Notes
Chat / streaming / multimodal input	✅	Fully supported
Function calling (FC)	✅	See Function Calling
Prompt cache discount	✅	Automatic for OpenAI models — see Cache Billing
Built-in tools (web search, code interpreter, …)	❌	Native Calls only
Multi-turn conversation	✅	Maintain `messages` history yourself (native Responses also requires self-managed history — see the Multi-Turn Guide)
`verbosity` output control	❌	Native only
Pro-series models (gpt-5.4-pro, …)	❌	In practice, native calls only
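
The "Multi-turn conversation" row means the full `messages` list is resent on every turn. One way to keep that bookkeeping in a single place is sketched below; `Conversation` is a hypothetical wrapper, not part of the SDK, and it takes any client exposing `chat.completions.create(...)`.

```python
from typing import Dict, List, Optional


class Conversation:
    """Keeps the `messages` list for one chat and resends it on every turn."""

    def __init__(self, client, model: str, system: Optional[str] = None) -> None:
        self.client = client
        self.model = model
        self.messages: List[Dict[str, str]] = []
        if system:
            self.messages.append({"role": "system", "content": system})

    def send(self, text: str) -> str:
        self.messages.append({"role": "user", "content": text})
        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=self.messages,
            )
        except Exception:
            self.messages.pop()  # roll back so a failed turn is not left in history
            raise
        reply = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

With the `client` from Quick Start, `chat = Conversation(client, "gpt-5.4")` followed by repeated `chat.send(...)` calls carries the earlier turns along automatically. Long chats will eventually need trimming or summarising, which this sketch leaves out.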

Migrating from OpenAI Direct

Already on OpenAI’s official service? Migration only touches two values, the base URL and the API key; your request code stays as it is. Use either option below:

Change base_url and key

# Before
client = OpenAI(api_key="sk-...")

# After
client = OpenAI(
    api_key="YOUR_APIYI_KEY",
    base_url="https://api.apiyi.com/v1"
)

Or change environment variables only (code untouched)

export OPENAI_API_KEY="YOUR_APIYI_KEY"
export OPENAI_BASE_URL="https://api.apiyi.com/v1"

Method calls, parameter formats, and response structures all stay identical.

Related Links

This group: Native Calls · Cache Billing · Function Calling
Models & pricing: Models & Pricing
Get / manage tokens: https://api.apiyi.com/token
Official OpenAI SDK list: platform.openai.com/docs/libraries
