Effort levels
Reasoning models expose a “how hard should I think” dial. Every API spells it differently. OpenGateway normalizes a single internal enum and translates it to whatever the upstream model expects — regardless of which inbound API you used.
The normalized enum
Section titled “The normalized enum”off · minimal · low · medium · high · maxThe default is roughly medium. If the upstream model does not support an effort knob, the gateway drops it silently — it never hard-fails a coding client over an unsupported parameter.
How each API signals effort
Section titled “How each API signals effort”| Tier | OpenAI Chat | OpenAI Responses | Anthropic (4.6+) | Anthropic (legacy ≤4.5) |
| --- | --- | --- | --- | --- |
| Off | reasoning_effort: "none" | reasoning.effort: "none" | omit thinking | thinking.type: "disabled" |
| Minimal | "minimal" | "minimal" | thinking.type: "adaptive" + output_config.effort: "low" | thinking.budget_tokens: ~1–2k |
| Low | "low" | "low" | output_config.effort: "low" | ~4–8k |
| Medium | "medium" | "medium" | output_config.effort: "medium" | ~8–16k |
| High | "high" | "high" | output_config.effort: "high" | ~16–32k |
| Max | "xhigh" | "xhigh" | output_config.effort: "xhigh"/"max" | ~32–64k (< max_tokens) |
Setting effort
Section titled “Setting effort”await client.chat.completions.create({ model: 'claude-turbo-hub-qwen3-coder', messages: [{ role: 'user', content: 'Find the bug.' }], reasoning_effort: 'high',});await client.responses.create({ model: 'claude-turbo-hub-gpt-oss-120b', input: 'Diagnose this stack trace.', reasoning: { effort: 'high' },});await client.messages.create({ model: 'claude-turbo-hub-qwen3-coder', max_tokens: 1024, thinking: { type: 'adaptive' }, output_config: { effort: 'high' }, messages: [{ role: 'user', content: 'Review this diff.' }],});Cross-API translation
Section titled “Cross-API translation”The point of the normalized enum is that effort survives a protocol mismatch:
- Claude Code (Anthropic
thinking) → routed to an HF chat model ⇒ emitted as OpenAIreasoning_effort. - Codex (OpenAI Responses
reasoning.effort) → routed to a Claude frontier model ⇒ emitted as Anthropicthinking/output_config.effort.
The branded SDK exposes a single effort option that
maps to whichever wire your call uses.
Ask about configuring OpenGateway — lanes, base URLs, client setup, model choice, or an error you hit. Answers are grounded in the docs.