openai · OpenAI Platform Docs

Realtime and audio

A guide for selecting and implementing different audio architectures, including voice-agent, translation, and transcription sessions using the Realtime API.

Derived skill

Files assembled from official documentation

When To Use

Use when deciding between realtime sessions and request-based audio APIs to build low-latency voice agents, continuous translators, or streaming transcription services.

Reference Files

File	Contains	Use For
`SKILL.md`	Entry point: scope, routing table, and workflow.	Start here.
`docs/realtime-and-audio-workflow-guide.md`	Guide to the use cases, architectures, and session types for the OpenAI realtime and audio APIs.	Questions about which architecture or session type fits a use case.

What This Skill Covers

Start with the outcome you want to build. Realtime sessions are best for live audio that needs low latency. Request-based audio APIs are best for files, boun...
Main sections: Common use cases, Understand different architectures, Choose a realtime session, Voice-agent sessions, Translation sessions.
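
The rule of thumb above can be sketched as a small routing helper. This is an illustrative sketch only, not part of any OpenAI SDK: the names `AudioArchitecture`, `SESSION_TYPES`, and `choose_architecture` are hypothetical, and sending live audio without a low-latency requirement to the request-based path is an assumption, since the source does not cover that case.

```python
from enum import Enum


class AudioArchitecture(Enum):
    """The two architecture families named in this skill."""
    REALTIME_SESSION = "realtime session"
    REQUEST_BASED = "request-based audio API"


# Session types named in this skill, keyed by the outcome being built.
# The keys are hypothetical labels chosen for this sketch.
SESSION_TYPES = {
    "voice agent": "voice-agent session",
    "translator": "translation session",
    "transcription": "transcription session",
}


def choose_architecture(is_live_audio: bool, needs_low_latency: bool) -> AudioArchitecture:
    """Pick an architecture family from the two criteria the skill states.

    Live audio that needs low latency goes to a realtime session.
    Everything else (for example, audio files) goes to the request-based
    APIs. Assumption: live audio with no latency requirement also falls
    through to request-based; the source does not say.
    """
    if is_live_audio and needs_low_latency:
        return AudioArchitecture.REALTIME_SESSION
    return AudioArchitecture.REQUEST_BASED


if __name__ == "__main__":
    # A live voice agent versus a batch of recorded files.
    print(choose_architecture(True, True).value)
    print(choose_architecture(False, False).value)
    print(SESSION_TYPES["translator"])
```

The helper only encodes the decision; for the exact session configuration, consult the guide under `docs/`.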

Workflow

Open the most relevant file under `docs/` for the exact documented workflow and wording.
Open `schemas/` files for exact structured contracts.
Open `examples/` files for concrete requests, commands, snippets, and manifests.
Do not add behavior or configuration that is not present in the attached source files.

Canonical source: https://developers.openai.com/api/docs/guides/realtime.md