google · Google AI Docs
Gemini API Priority inference
Explains how to request the Gemini API Priority inference tier by setting the service tier configuration on a generateContent request, with Python, JavaScript, Go, and REST examples.
Derived skill
Files assembled from official documentation
When To Use
Use when you need to send Gemini API requests on the Priority inference tier, for example by setting the service tier configuration to priority on a generateContent request.
Reference Files
| File | Contains | Use For |
|---|---|---|
| `SKILL.md` | Entry point: scope, routing table, and workflow. | Start here. |
| `docs/gemini-api-priority-inference-workflow-guide.md` | A guide explaining how to use the Gemini API Priority inference tier with Python, JavaScript, and Go implementations. | Questions about a guide explaining how to use the Gemini API Priority inference tier with Python, JavaScript, and Go implementations. |
| `examples/gemini-api-priority-inference-python-generatecontent.text` | A Python code example demonstrating how to use the servicetier configuration in a generateContent request to request priority inference via the Gemini API. | Exact payloads, commands, or snippets shown in A Python code example demonstrating how to use the servicetier configuration in a generateContent request to request... |
| `examples/gemini-api-priority-inference-nodejs.text` | A Node.js code example demonstrating how to use the serviceTier configuration to request priority inference with the Gemini API. | Exact payloads, commands, or snippets shown in A Node.js code example demonstrating how to use the serviceTier configuration to request priority inference with the... |
| `examples/gemini-api-priority-inference-go-generatecontent.text` | A Go program demonstrating how to use the Gemini API to perform priority inference on a customer support ticket. | Exact payloads, commands, or snippets shown in A Go program demonstrating how to use the Gemini API to perform priority inference on a customer support ticket. |
| `examples/gemini-api-priority-inference-curl-request.text` | A curl command demonstrating how to send a POST request to the Gemini API with the servicetier parameter set to priority. | Exact payloads, commands, or snippets shown in A curl command demonstrating how to send a POST request to the Gemini API with the servicetier parameter set to prior... |
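The routing in the table above can be sketched as a small lookup. This is only an illustration: the helper name `reference_file` and the alias map are hypothetical and not part of the skill, while the file paths are copied from the table. Unknown surfaces fall back to the workflow guide.

```python
# Hypothetical helper: maps a request surface to the reference file
# listed in the table. Only the file paths come from the skill itself.
EXAMPLE_FILES = {
    "python": "examples/gemini-api-priority-inference-python-generatecontent.text",
    "nodejs": "examples/gemini-api-priority-inference-nodejs.text",
    "go": "examples/gemini-api-priority-inference-go-generatecontent.text",
    "curl": "examples/gemini-api-priority-inference-curl-request.text",
}
GUIDE = "docs/gemini-api-priority-inference-workflow-guide.md"

# Assumed aliases for convenience; adjust or drop as needed.
ALIASES = {"javascript": "nodejs", "js": "nodejs", "node": "nodejs", "rest": "curl"}


def reference_file(surface: str) -> str:
    """Return the example file for a surface, or the guide if none matches."""
    key = surface.strip().lower()
    key = ALIASES.get(key, key)
    return EXAMPLE_FILES.get(key, GUIDE)


if __name__ == "__main__":
    print(reference_file("JavaScript"))
    print(reference_file("REST"))
```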
What This Skill Covers
- Preview: The Gemini Priority API is in Preview.
- Main sections: How to use Priority, Python, JavaScript, Go, REST.
Workflow
- Open the most relevant file under `docs/` for the exact documented workflow and wording.
- Open `schemas/` files for exact structured contracts.
- Open `examples/` files for concrete requests, commands, snippets, and manifests.
- Do not add behavior or configuration that is not present in the attached source files.
Canonical source: https://ai.google.dev/gemini-api/docs/priority-inference
