openai · OpenAI Platform Docs
Evaluation best practices | OpenAI API
Provides strategic guidance and methodologies for designing, implementing, and interpreting evaluation frameworks for LLM-based applications.
Derived skill
Files assembled from official documentation
When To Use
Use when designing an evaluation pipeline to measure model accuracy, reliability, or performance improvements in an LLM application.
Reference Files
| File | Contains | Use For |
|---|---|---|
| SKILL.md | Entry point: scope, routing table, and workflow. | Start here. |
| docs/evaluation-best-practices-openai-api-workflow-guide.md | A guide outlining methodologies for testing generative AI systems, including types of evaluations and designing an evaluation process. | Questions about a guide outlining methodologies for testing generative AI systems, including types of evaluations and designing an ev... |
What This Skill Covers
- Generative AI is variable. Models sometimes produce different output from the same input, which makes traditional software testing methods insufficient for A...
- Main sections: What are evals?, Types of evals, How to read evals, Design your eval process, Example: Summarizing transcripts.
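To illustrate why variable output makes a single exact-match assertion a poor test, here is a minimal sketch that measures a pass rate over repeated trials instead. Everything in it is an assumption made for illustration: `fake_model` is a hypothetical stand-in for a real model call, and the candidate answers, grader functions, and trial count do not come from the source documentation.

```python
import random


def fake_model(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call: the same prompt can yield different text."""
    candidates = [
        "Paris",
        "The capital of France is Paris.",
        "paris",
        "I believe it is Lyon.",
    ]
    return rng.choice(candidates)


def exact_match(output: str, expected: str) -> bool:
    """Traditional assertion style: passes only on an identical string."""
    return output == expected


def contains_answer(output: str, expected: str) -> bool:
    """Looser grader (an assumption): passes if the expected answer appears, ignoring case."""
    return expected.lower() in output.lower()


def pass_rate(grader, prompt: str, expected: str, trials: int = 200, seed: int = 0) -> float:
    """Run the same prompt many times and report the fraction of outputs that pass."""
    rng = random.Random(seed)  # fixed seed so both graders see the same outputs
    passed = sum(grader(fake_model(prompt, rng), expected) for _ in range(trials))
    return passed / trials


if __name__ == "__main__":
    prompt = "What is the capital of France?"
    print("exact match pass rate:    ", pass_rate(exact_match, prompt, "Paris"))
    print("contains-answer pass rate:", pass_rate(contains_answer, prompt, "Paris"))
```

Because both graders are run with the same seed, they score the same sequence of outputs, so any difference in pass rate comes from the grading criterion rather than from sampling.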
Workflow
- Open the most relevant file under `docs/` for the exact documented workflow and wording.
- Open `schemas/` files for exact structured contracts.
- Open `examples/` files for concrete requests, commands, snippets, and manifests.
- Do not add behavior or configuration that is not present in the attached source files.
Canonical source: https://developers.openai.com/api/docs/guides/evaluation-best-practices
