Prompt Buddy logoPrompt Buddy

openai · OpenAI Platform Docs

Audio and speech | OpenAI API

Explains how to implement audio-related capabilities including speech-to-text transcription, text-to-speech synthesis, and real-time audio interactions using OpenAI models.

Import to Prompt Buddy

Derived skill

Files assembled from official documentation

Viewing SKILL.md

Audio and speech | OpenAI API

Explains how to implement audio-related capabilities including speech-to-text transcription, text-to-speech synthesis, and real-time audio interactions using OpenAI models.

When To Use

Use when implementing features such as automated transcription, voice synthesis, or real-time conversational audio interfaces.

Reference Files

FileContainsUse For
SKILL.mdEntry point: scope, routing table, and workflow.Start here.
docs/audio-and-speech-openai-api-workflow-guide.mdA guide explaining audio modalities, speech tasks, streaming, and the differences between request-based APIs and realtime sessions for OpenAI audio models.Questions about a guide explaining audio modalities, speech tasks, streaming, and the differences between request-based APIs and real...
examples/audio-and-speech-openai-api-openai-realtime-agent-session-nodejs.textA JavaScript code example demonstrating how to initialize a RealtimeAgent and connect to a RealtimeSession using the OpenAI API.Exact payloads, commands, or snippets shown in A JavaScript code example demonstrating how to initialize a RealtimeAgent and connect to a RealtimeSession using the...
examples/audio-and-speech-openai-api-openai-api-chat-completions-audio-modalities.textA Node.js code example demonstrating how to generate an audio response using the chat completions endpoint with text and audio modalities.Exact payloads, commands, or snippets shown in A Node.js code example demonstrating how to generate an audio response using the chat completions endpoint with text...
examples/audio-and-speech-openai-api-openai-api-gpt-audio-multimodal-python.textA Python code example demonstrating how to use the gpt-audio model with text and audio modalities to generate a response.Exact payloads, commands, or snippets shown in A Python code example demonstrating how to use the gpt-audio model with text and audio modalities to generate a respo...
examples/audio-and-speech-openai-api-openai-api-chat-completions-audio-modalities-2.textA curl command demonstrating how to request text and audio modalities using the gpt-audio model via the OpenAI Chat Completions API.Exact payloads, commands, or snippets shown in A curl command demonstrating how to request text and audio modalities using the gpt-audio model via the OpenAI Chat C...
examples/audio-and-speech-openai-api-openai-api-audio-transcription-nodejs.textA Node.js code example demonstrating how to fetch an audio file and prepare it for transcription using the OpenAI API.Exact payloads, commands, or snippets shown in A Node.js code example demonstrating how to fetch an audio file and prepare it for transcription using the OpenAI API.
examples/audio-and-speech-openai-api-openai-api-audio-transcription-python.textA Python script demonstrating how to fetch an audio file and prepare it for transcription using the OpenAI client.Exact payloads, commands, or snippets shown in A Python script demonstrating how to fetch an audio file and prepare it for transcription using the OpenAI client.
examples/audio-and-speech-openai-api-openai-api-chat-completions-audio-modalities-3.textA curl command demonstrating how to request text and audio modalities using the gpt-audio model via the OpenAI Chat Completions API.Exact payloads, commands, or snippets shown in A curl command demonstrating how to request text and audio modalities using the gpt-audio model via the OpenAI Chat C...

What This Skill Covers

  • Audio models can understand spoken input, generate spoken output, or do both in the same interaction. This guide explains the vocabulary used across OpenAI’s...
  • Main sections: Audio modalities, Common speech tasks, Streaming and latency, Request-based APIs and realtime sessions, Add audio to your existing application.

Workflow

  1. Open the most relevant file under docs/ for the exact documented workflow and wording.
  2. Open schemas/ files for exact structured contracts.
  3. Open examples/ files for concrete requests, commands, snippets, and manifests.
  4. Do not add behavior or configuration that is not present in the attached source files.

Canonical source: https://developers.openai.com/api/docs/guides/audio