openai · OpenAI Platform Docs
Realtime and audio
A guide for selecting and implementing different audio architectures, including voice-agent, translation, and transcription sessions using the Realtime API.
Derived skill
Files assembled from official documentation
When To Use
Use when deciding between realtime sessions and request-based audio APIs to build low-latency voice agents, continuous translators, or streaming transcription services.
Reference Files
| File | Contains | Use For |
|---|---|---|
| SKILL.md | Entry point: scope, routing table, and workflow. | Start here. |
| docs/realtime-and-audio-workflow-guide.md | Use cases, architectures, and session types for OpenAI realtime and audio APIs. | Questions about use cases, architectures, and session types for OpenAI realtime and audio APIs. |
What This Skill Covers
- Start with the outcome you want to build. Realtime sessions are best for live audio that needs low latency. Request-based audio APIs are best for files, boun...
- Main sections: Common use cases, Understand different architectures, Choose a realtime session, Voice-agent sessions, Translation sessions.
Workflow
- Open the most relevant file under `docs/` for the exact documented workflow and wording.
- Open `schemas/` files for exact structured contracts.
- Open `examples/` files for concrete requests, commands, snippets, and manifests.
- Do not add behavior or configuration that is not present in the attached source files.
Canonical source: https://developers.openai.com/api/docs/guides/realtime.md
