Prompt Buddy

openai · OpenAI Platform Docs

Evaluation best practices | OpenAI API

Provides strategic guidance and methodologies for designing, implementing, and interpreting evaluation frameworks for LLM-based applications.

Derived skill

Files assembled from official documentation

When To Use

Use when designing an evaluation pipeline to measure model accuracy, reliability, or performance improvements in an LLM application.

Reference Files

| File | Contains | Use For |
| --- | --- | --- |
| SKILL.md | Entry point: scope, routing table, and workflow. | Start here. |
| docs/evaluation-best-practices-openai-api-workflow-guide.md | A guide outlining methodologies for testing generative AI systems, including types of evaluations and designing an evaluation process. | Questions about a guide outlining methodologies for testing generative AI systems, including types of evaluations and designing an ev... |

What This Skill Covers

  • Generative AI is variable. Models sometimes produce different output from the same input, which makes traditional software testing methods insufficient for A...
  • Main sections: What are evals?, Types of evals, How to read evals, Design your eval process, Example: Summarizing transcripts.
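
Because the same input can yield different outputs, a single assertion on one response says little. A minimal sketch of the alternative is to run the same prompt several times and report the fraction of outputs that pass a grader. Everything named here (`pass_rate`, `make_stub_model`, `mentions_paris`, the sample prompt) is hypothetical and for illustration only. The stub stands in for a real model call and none of this is taken from the OpenAI guide or its API.

```python
import random
from typing import Callable


def pass_rate(
    model: Callable[[str], str],
    prompt: str,
    grader: Callable[[str], bool],
    trials: int = 20,
) -> float:
    """Call `model` on the same prompt `trials` times; return the fraction of outputs the grader accepts."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    passed = sum(1 for _ in range(trials) if grader(model(prompt)))
    return passed / trials


def make_stub_model(seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for a model: returns a randomly chosen canned answer."""
    rng = random.Random(seed)
    answers = ["Paris", "paris", "The capital is Paris.", "Lyon"]

    def model(prompt: str) -> str:
        # The prompt is ignored; a real model call would go here.
        return rng.choice(answers)

    return model


def mentions_paris(output: str) -> bool:
    """Assumed grading rule: accept any output that mentions Paris, ignoring case."""
    return "paris" in output.lower()


if __name__ == "__main__":
    rate = pass_rate(make_stub_model(0), "What is the capital of France?", mentions_paris)
    print(f"pass rate over 20 trials: {rate:.2f}")
```

Reporting a rate rather than a single pass or fail makes the variability visible. The grader and the number of trials are design choices that depend on the application.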

Workflow

  1. Open the most relevant file under docs/ for the exact documented workflow and wording.
  2. Open schemas/ files for exact structured contracts.
  3. Open examples/ files for concrete requests, commands, snippets, and manifests.
  4. Do not add behavior or configuration that is not present in the attached source files.

Canonical source: https://developers.openai.com/api/docs/guides/evaluation-best-practices