Responses API Contract
Agnitra uses the OpenAI Responses API to translate profiler telemetry into kernel tuning recommendations. This reference covers request schemas, tooling payloads, and operational guardrails.
Base URL
Authentication
| Header | Value |
|---|---|
| Authorization | Bearer <OPENAI_API_KEY> |
| OpenAI-Project | Optional. Overrides the project context associated with the API key. |
| OpenAI-Organization | Optional. Use when the key belongs to multiple organisations. |
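The header table above can be sketched as a small helper. This is a minimal illustration, not Agnitra's actual client code: the function name `build_headers` and the added `Content-Type` header are assumptions made for the example.

```python
def build_headers(api_key, project=None, organization=None):
    """Assemble HTTP headers for a Responses API call (illustrative helper)."""
    if not api_key:
        raise ValueError("api_key is required")
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Assumption: requests carry a JSON body, so a JSON content type is set.
        "Content-Type": "application/json",
    }
    # Both headers are optional; send them only when a value is supplied.
    if project:
        headers["OpenAI-Project"] = project
    if organization:
        headers["OpenAI-Organization"] = organization
    return headers
```

Leaving the optional headers out entirely, rather than sending empty values, keeps the key's default project and organisation in effect.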
Request Schema
- `model` must reference a Responses-capable deployment (e.g. `gpt-5-codex` or `gpt-5-mini`). Do not use deprecated parameters such as `temperature` or `max_output_tokens`.
- `input` accepts text or image content. Agnitra sends JSON-formatted telemetry snippets.
- `tools` unlock structured responses via function calling; strict schemas avoid invalid payloads.
- `metadata` captures attribution for usage metering and billing.
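A request body following the four fields above might be assembled as in the sketch below. The tool name `recommend_kernel_tuning`, its argument fields, and the helper `build_request` are hypothetical and chosen only for illustration. The tool definition uses the function-tool layout with `strict` enabled, which requires every property to be listed in `required` and `additionalProperties` to be `false`.

```python
import json

# Hypothetical tool definition; the name and argument fields are illustrative only.
TUNING_TOOL = {
    "type": "function",
    "name": "recommend_kernel_tuning",
    "description": "Propose tuning parameters for one profiled kernel.",
    "strict": True,
    "parameters": {
        "type": "object",
        "properties": {
            "kernel_name": {"type": "string"},
            "block_size": {"type": "integer"},
            "rationale": {"type": "string"},
        },
        # Strict mode: every property is required and extras are rejected.
        "required": ["kernel_name", "block_size", "rationale"],
        "additionalProperties": False,
    },
}


def build_request(model, telemetry, metadata):
    """Build a request body without temperature or max_output_tokens."""
    return {
        "model": model,
        # Telemetry is sent as a JSON-formatted text snippet.
        "input": json.dumps(telemetry, sort_keys=True),
        "tools": [TUNING_TOOL],
        # Metadata values are coerced to strings for attribution.
        "metadata": {str(k): str(v) for k, v in metadata.items()},
    }
```

For example, `build_request("gpt-5-mini", {"kernel": "matmul", "latency_ms": 1.2}, {"project_id": 42})` yields a body whose `metadata` is `{"project_id": "42"}`.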
Response Structure
- Parse `tool_calls` and validate against the schema before mutating kernels.
- `usage.total_tokens` feeds into Agnitra’s cost telemetry pipeline.
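The two points above can be sketched as follows. The response shape is not shown in this reference, so the example makes an assumption: tool calls arrive as items in an `output` list with `type` set to `function_call` and a JSON-encoded `arguments` string. The expected argument fields are hypothetical, and the hand-rolled type check stands in for a full JSON Schema validator.

```python
import json

# Hypothetical argument schema for a single tool, expressed as Python types.
EXPECTED_ARGS = {"kernel_name": str, "block_size": int, "rationale": str}


def validate_args(args, expected=EXPECTED_ARGS):
    """Accept only dicts with exactly the expected keys and value types."""
    if not isinstance(args, dict) or set(args) != set(expected):
        return False
    for key, expected_type in expected.items():
        value = args[key]
        # bool is a subclass of int; no field here expects a bool, so reject it.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            return False
    return True


def extract_tool_calls(response):
    """Return (name, args) pairs for tool calls that pass validation."""
    calls = []
    for item in response.get("output", []):
        if item.get("type") != "function_call":
            continue
        try:
            args = json.loads(item.get("arguments", ""))
        except (json.JSONDecodeError, TypeError):
            continue  # malformed arguments are dropped, never executed
        if validate_args(args):
            calls.append((item.get("name"), args))
    return calls


def total_tokens(response):
    """Read usage.total_tokens, defaulting to 0 when usage is absent."""
    return int(response.get("usage", {}).get("total_tokens", 0))
```

Dropping invalid calls silently is a choice made for this sketch; a production pipeline would more likely log or surface them before any kernel is touched.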
Rate Limits & Diagnostics
- Inspect `x-ratelimit-*` headers to understand remaining token/request budgets.
- Log the `x-request-id` header for each call to accelerate support escalation.
- Retry with exponential backoff on `429` and `5xx` responses; avoid retrying validation errors.
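Collecting the diagnostic headers named above could look like the sketch below. The helper name `rate_limit_snapshot` and the shape of its return value are assumptions for illustration; the input is any mapping of response header names to values.

```python
def rate_limit_snapshot(headers):
    """Collect the request ID and all x-ratelimit-* headers from a response."""
    # HTTP header names are case-insensitive, so normalise before matching.
    lowered = {str(k).lower(): v for k, v in headers.items()}
    return {
        "request_id": lowered.get("x-request-id"),
        "limits": {
            name: value
            for name, value in lowered.items()
            if name.startswith("x-ratelimit-")
        },
    }
```

The resulting dictionary can be attached to a structured log record for each call, so the request ID is at hand when a support escalation is needed.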
Error Handling
| Status | Typical Cause | Resolution |
|---|---|---|
| 400 Bad Request | Invalid JSON schema or tool payload. | Inspect error body, correct request data. |
| 401 Unauthorized | API key missing/invalid. | Refresh credentials and retry. |
| 403 Forbidden | Project/org mismatch. | Confirm OpenAI-Project/OpenAI-Organization headers. |
| 429 Too Many Requests | Rate limit exceeded. | Use Retry-After header and exponential backoff. |
| 500/503 | Upstream transient error. | Retry with jitter (max 3 attempts). |
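The retry rows of the table can be sketched as a pair of pure functions. This is one possible policy, not a prescribed one: it assumes the three-attempt cap applies to `429` as well as `5xx`, and the names `should_retry` and `backoff_delay`, the one-second base, and the 30-second cap are illustrative.

```python
import random

MAX_ATTEMPTS = 3  # the table caps 500/503 at three attempts; reused for 429 here


def should_retry(status, attempts_made):
    """Retry only rate limits and server errors, never 4xx validation errors."""
    retryable = status == 429 or 500 <= status <= 599
    return retryable and attempts_made < MAX_ATTEMPTS


def backoff_delay(attempts_made, retry_after=None, base=1.0, cap=30.0,
                  rng=random.random):
    """Return seconds to wait before the next attempt."""
    if retry_after is not None:
        try:
            # Honour a numeric Retry-After value when the server sends one.
            return max(0.0, float(retry_after))
        except (TypeError, ValueError):
            pass  # Retry-After may also be an HTTP date; fall back to backoff
    # Exponential growth (base, 2*base, 4*base, ...) bounded by cap.
    ceiling = min(cap, base * (2 ** (attempts_made - 1)))
    # Full jitter: pick a delay uniformly between 0 and the ceiling.
    return rng() * ceiling
```

A caller would loop while `should_retry(status, attempts_made)` is true and sleep for `backoff_delay(attempts_made, response_headers.get("Retry-After"))` between attempts. Passing `rng` explicitly keeps the function deterministic in tests.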
Security Checklist
- Rotate API keys regularly and scope them to the minimum required access.
- Sanitize telemetry content to avoid leaking customer identifiers.
- Validate every tool-call payload against the defined JSON schema prior to execution.
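The sanitisation item in the checklist above might be approached as in this sketch, which redacts values by key name before telemetry leaves the host. The key names in `SENSITIVE_KEYS` are hypothetical; the real set depends on what the profiler emits. Key-based redaction does not catch identifiers embedded in free-text values.

```python
# Hypothetical key names; adjust to the fields the profiler actually emits.
SENSITIVE_KEYS = {"customer_id", "customer_name", "email", "hostname"}


def sanitize(value):
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else sanitize(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [sanitize(item) for item in value]
    # Scalars (numbers, strings, None) pass through unchanged.
    return value
```

The function returns a new structure and leaves its input untouched, so the unredacted telemetry can still be used locally.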
Related Documentation
- Runtime Configuration — environment variables that toggle Responses API usage.
- SDK & CLI Guide — how responses inform local optimisation flows.
- Architecture Overview — where Responses API fits in the optimization loop.