openai · OpenAI Platform Docs
Realtime transcription
Implements live speech-to-text streaming using the Realtime API by configuring transcription sessions, streaming audio chunks via WebSockets or WebRTC, and handling incremental transcript delta and completion events.
Derived skill
Files assembled from official documentation
When To Use
Use when implementing live captioning or real-time speech-to-text features that require streaming text deltas as audio arrives.
Reference Files
| File | Contains | Use For |
|---|---|---|
| SKILL.md | Entry point: scope, routing table, and workflow. | Start here. |
| docs/realtime-transcription-workflow-guide.md | A guide explaining how to create transcription sessions, configure session fields, and stream audio for live speech-to-text. | Questions about a guide explaining how to create transcription sessions, configure session fields, and stream audio for live speech-t... |
| examples/realtime-transcription-openai-realtime-transcription-session-update.json | A JSON configuration object for updating a realtime session with transcription settings and audio input parameters. | Exact payloads, commands, or snippets shown in A JSON configuration object for updating a realtime session with transcription settings and audio input parameters. |
| examples/realtime-transcription-openai-realtime-transcription-javascript-input-au.javascript | A JavaScript code example demonstrating how to send base64 encoded PCM16 audio buffers to the OpenAI Realtime API using a WebSocket connection. | Exact payloads, commands, or snippets shown in A JavaScript code example demonstrating how to send base64 encoded PCM16 audio buffers to the OpenAI Realtime API usi... |
| examples/realtime-transcription-openai-realtime-transcription.javascript | A JavaScript code example demonstrating how to send input audio buffer commit events via WebSocket for realtime transcription using the OpenAI API. | Exact payloads, commands, or snippets shown in A JavaScript code example demonstrating how to send input audio buffer commit events via WebSocket for realtime trans... |
| examples/realtime-transcription-openai-realtime-transcription-javascript-event-ha.javascript | A JavaScript code snippet demonstrating how to handle and process real-time audio transcription delta and completion events via a WebSocket connection. | Exact payloads, commands, or snippets shown in A JavaScript code snippet demonstrating how to handle and process real-time audio transcription delta and completion... |
| examples/realtime-transcription-openai-realtime-transcription-audio-delta-event.json | A JSON object representing a conversation item input audio transcription delta event for the OpenAI Realtime API. | Exact payloads, commands, or snippets shown in A JSON object representing a conversation item input audio transcription delta event for the OpenAI Realtime API. |
| examples/realtime-transcription-openai-realtime-transcription-event.json | A JSON object representing a conversation item input audio transcription completed event. | Exact payloads, commands, or snippets shown in A JSON object representing a conversation item input audio transcription completed event. |
| examples/realtime-transcription-openai-realtime-transcription-keywords.text | A text file containing a list of extracted medical keywords from a realtime transcription session. | Exact payloads, commands, or snippets shown in A text file containing a list of extracted medical keywords from a realtime transcription session. |
| examples/realtime-transcription-openai-realtime-transcription-session-update-2.json | A JSON object configuring the session update parameters for realtime transcription including model selection and logprobs inclusion. | Exact payloads, commands, or snippets shown in A JSON object configuring the session update parameters for realtime transcription including model selection and logp... |
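
Two of the example files above cover sending base64-encoded PCM16 audio and committing the input audio buffer. As a rough orientation only, the event payloads might be built as in the sketch below. The helper names `buildAppendEvent` and `buildCommitEvent` are hypothetical, and the `type` and `audio` field names are assumptions that should be checked against the files under examples/ before use. The sketch assumes Node.js (for `Buffer`) and a little-endian platform.

```javascript
// Hypothetical helper: wraps PCM16 samples in an append event.
// Assumption: the event carries base64 audio in an `audio` field.
function buildAppendEvent(pcm16Samples) {
  // View the Int16Array's underlying bytes without copying.
  // Typed arrays use platform byte order (little-endian on common hardware).
  const bytes = Buffer.from(
    pcm16Samples.buffer,
    pcm16Samples.byteOffset,
    pcm16Samples.byteLength
  );
  return {
    type: "input_audio_buffer.append",
    audio: bytes.toString("base64"),
  };
}

// Hypothetical helper: builds the commit event sent after a run of appends.
function buildCommitEvent() {
  return { type: "input_audio_buffer.commit" };
}
```

With an open WebSocket `ws` (not shown here), each payload would be sent as `ws.send(JSON.stringify(buildAppendEvent(chunk)))`.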
What This Skill Covers
- Use realtime transcription when your application needs live speech-to-text without a spoken assistant response. Realtime transcription sessions stream transc...
- Main sections: Choose a transcription model, Create a transcription session, Session fields, Stream audio, Handle transcript events.
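
To make the "Handle transcript events" step more concrete, here is a minimal sketch of accumulating incremental deltas into a running transcript and finalizing it on completion. `createTranscriptAccumulator` is a hypothetical helper, not part of any SDK. The event `type` strings and the `item_id`, `delta`, and `transcript` fields are assumptions based on the delta and completed event examples listed in the table, so confirm them against those files.

```javascript
// Hypothetical helper: tracks transcript text per conversation item.
function createTranscriptAccumulator() {
  const partial = new Map(); // item_id -> text received so far

  return {
    // Accepts one parsed server event and returns a display update,
    // or null when the event is not a transcription event.
    handle(event) {
      switch (event.type) {
        case "conversation.item.input_audio_transcription.delta": {
          // Append the new fragment to whatever has arrived for this item.
          const soFar = (partial.get(event.item_id) ?? "") + (event.delta ?? "");
          partial.set(event.item_id, soFar);
          return { itemId: event.item_id, text: soFar, final: false };
        }
        case "conversation.item.input_audio_transcription.completed": {
          // The completed event is assumed to carry the full transcript,
          // so the accumulated partial text can be dropped.
          partial.delete(event.item_id);
          return { itemId: event.item_id, text: event.transcript, final: true };
        }
        default:
          return null;
      }
    },
  };
}
```

In a WebSocket client this would typically be called from the message handler, for example `const update = acc.handle(JSON.parse(message))`, with non-null updates rendered as captions.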
Workflow
- Open the most relevant file under `docs/` for the exact documented workflow and wording.
- Open `schemas/` files for exact structured contracts.
- Open `examples/` files for concrete requests, commands, snippets, and manifests.
- Do not add behavior or configuration that is not present in the attached source files.
Canonical source: https://developers.openai.com/api/docs/guides/realtime-transcription.md
