openai · OpenAI Platform Docs
Voice activity detection (VAD) | OpenAI API
Explains how to implement and configure Voice Activity Detection (VAD) within the OpenAI Realtime API to automatically detect when a user starts and stops speaking.
Derived skill
Files assembled from official documentation
When To Use
Use when you need to implement automatic speech detection to manage turn-taking in real-time voice applications without manual trigger mechanisms.
Reference Files
| File | Contains | Use For |
|---|---|---|
| SKILL.md | Entry point: scope, routing table, and workflow. | Start here. |
| docs/voice-activity-detection-vad-openai-api-workflow-guide.md | A guide explaining server-side and semantic voice activity detection (VAD) features within the OpenAI Realtime API. | Questions about the server-side and semantic VAD features of the OpenAI Realtime API. |
| examples/voice-activity-detection-vad-openai-api-openai-realtime-api-vad-session-.text | A JSON configuration object for the session.update event to define server-side voice activity detection parameters like threshold and silence duration. | Exact payloads, commands, or snippets shown in a JSON configuration object for the session.update event to define server-side voice activity detection parameters li... |
| examples/voice-activity-detection-vad-openai-api-openai-realtime-api-vad-session--2.text | A JSON configuration object for the session.update event to define semantic voice activity detection parameters. | Exact payloads, commands, or snippets shown in a JSON configuration object for the session.update event to define semantic voice activity detection parameters. |
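
The two example files listed above hold session.update payloads for server VAD and semantic VAD. As a rough sketch of how such payloads might be assembled in Python, the snippet below assumes the settings sit under session.turn_detection and uses the field names type, threshold, and silence_duration_ms. The nesting, the field names, the 0 to 1 threshold range, and the numeric values are all assumptions made for illustration; the example files remain the reference for the exact documented shape.

```python
import json


def server_vad(threshold, silence_duration_ms):
    """Build server VAD settings (field names are assumed, not confirmed here)."""
    # Assumption: threshold is a value between 0 and 1.
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if silence_duration_ms < 0:
        raise ValueError("silence_duration_ms must not be negative")
    return {
        "type": "server_vad",
        "threshold": threshold,
        "silence_duration_ms": silence_duration_ms,
    }


def semantic_vad():
    """Build semantic VAD settings with no extra parameters."""
    return {"type": "semantic_vad"}


def session_update(turn_detection):
    """Wrap turn-detection settings in a session.update event.

    Assumption: the settings live under session.turn_detection; the
    documented payload may nest them differently.
    """
    return {"type": "session.update", "session": {"turn_detection": turn_detection}}


# Illustrative values only; they are not recommendations from the source.
server_payload = session_update(server_vad(threshold=0.6, silence_duration_ms=400))
semantic_payload = session_update(semantic_vad())

# Events are sent as JSON text, so serialize before writing to the connection.
print(json.dumps(server_payload, indent=2))
print(json.dumps(semantic_payload, indent=2))
```

Keeping the turn-detection settings in small builder functions makes it easy to switch between the two VAD modes without touching the code that sends the event.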
What This Skill Covers
- Voice activity detection (VAD) is a Realtime API feature that automatically detects when the user has started or stopped speaking. It...
- Main sections: Overview, Server VAD, Semantic VAD.
Workflow
- Open the most relevant file under docs/ for the exact documented workflow and wording.
- Open schemas/ files for exact structured contracts.
- Open examples/ files for concrete requests, commands, snippets, and manifests.
- Do not add behavior or configuration that is not present in the attached source files.
Canonical source: https://developers.openai.com/api/docs/guides/realtime-vad
