Prompt Buddy

openai · OpenAI Platform Docs

Graders

Teaches how to implement and configure various grader types including string checks, text similarity, and model-based scoring to evaluate model performance against reference answers using JSON-based specifications and...

Derived skill

Files assembled from official documentation

When To Use

Use when you need to implement automated evaluation workflows to score model outputs using string matching, similarity metrics, or LLM-based grading.

Reference Files

| File | Contains | Use For |
| --- | --- | --- |
| SKILL.md | Entry point: scope, routing table, and workflow. | Start here. |
| docs/graders-workflow-guide.md | A guide explaining how to use the Graders API to evaluate model performance using templating and specific item and sample namespaces. | Questions about a guide explaining how to use the Graders API to evaluate model performance using templating and specific item and sa... |
| examples/graders-openai-graders.json | A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation criteria. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation criteria. |
| examples/graders-openai-graders-configuration.json | A JSON configuration object defining multi-type grader rules for validating function names and arguments against reference outputs. | Exact payloads, commands, or snippets shown in A JSON configuration object defining multi-type grader rules for validating function names and arguments against refe... |
| examples/graders-openai-graders-json-definition.json | A JSON schema defining the structure for grader operations including string checks, comparison types, and reference inputs. | Exact payloads, commands, or snippets shown in A JSON schema defining the structure for grader operations including string checks, comparison types, and reference i... |
| examples/graders-openai-graders-textsimilarity.json | A JSON schema defining the structure for text similarity grading operations including evaluation metrics and pass thresholds. | Exact payloads, commands, or snippets shown in A JSON schema defining the structure for text similarity grading operations including evaluation metrics and pass thr... |
| examples/graders-openai-graders-scoremodel.json | A JSON schema defining the structure for a score_model object including input messages, model parameters, and pass thresholds for grading tasks. | Exact payloads, commands, or snippets shown in A JSON schema defining the structure for a score_model object including input messages, model parameters, and pass thr... |
| examples/graders-openai-graders-2.json | A JSON object demonstrating the structure for defining grader roles including system, developer, user, and assistant messages. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure for defining grader roles including system, developer, user, and assistant... |
| examples/graders-openai-graders-python-scoremodel-implementation.python | A Python script demonstrating how to define a dummy score_model grader using the OpenAI API. | Exact payloads, commands, or snippets shown in A Python script demonstrating how to define a dummy score_model grader using the OpenAI API. |
| examples/graders-openai-graders-3.json | A JSON object demonstrating the structure of a grader output including result scores and reasoning steps. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of a grader output including result scores and reasoning steps. |
| examples/graders-openai-graders-4.json | A JSON object demonstrating the required schema for defining grader descriptions and conclusions. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the required schema for defining grader descriptions and conclusions. |
| examples/graders-openai-graders-comparison-logic.text | A text-based example demonstrating the comparison logic for grading multiple answers using the graders guide. | Exact payloads, commands, or snippets shown in A text-based example demonstrating the comparison logic for grading multiple answers using the graders guide. |
| examples/graders-openai-graders-model-grader-comparison-logic.text | A text representation of the model_grader function logic used to compare multiple answers against a reference answer. | Exact payloads, commands, or snippets shown in A text representation of the model_grader function logic used to compare multiple answers against a reference answer. |
| examples/graders-openai-platform-docs-graders.json | A JSON representation of a grader configuration example used for evaluating model outputs. | Exact payloads, commands, or snippets shown in A JSON representation of a grader configuration example used for evaluating model outputs. |
| examples/graders-openai-graders-python-implementation.python | A Python code example demonstrating how to implement a custom grading function for the OpenAI Graders feature. | Exact payloads, commands, or snippets shown in A Python code example demonstrating how to implement a custom grading function for the OpenAI Graders feature. |
| examples/graders-openai-graders-5.json | A JSON object demonstrating the structure of grader outputs including text, JSON, tools, and audio fields. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of grader outputs including text, JSON, tools, and audio fields. |
| examples/graders-openai-graders-6.json | A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation keys. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation keys. |
| examples/graders-openai-graders-python-implementation-2.python | A Python script demonstrating how to implement a grading function using the OpenAI platform to evaluate sample outputs against reference answers. | Exact payloads, commands, or snippets shown in A Python script demonstrating how to implement a grading function using the OpenAI platform to evaluate sample output... |
| examples/graders-openai-graders-python-implementation-3.python | A Python code example demonstrating how to implement and use the Graders functionality within the OpenAI platform. | Exact payloads, commands, or snippets shown in A Python code example demonstrating how to implement and use the Graders functionality within the OpenAI platform. |
| examples/graders-openai-platform-docs-graders-python-dependencies.text | A list of required Python library versions including numpy, scipy, and scikit-learn for implementing graders. | Exact payloads, commands, or snippets shown in A list of required Python library versions including numpy, scipy, and scikit-learn for implementing graders. |
| examples/graders-openai-graders-punkt-wordnet.text | A text file containing sample data including punkt, stopwords, and wordnet metadata used for grading demonstrations. | Exact payloads, commands, or snippets shown in A text file containing sample data including punkt, stopwords, and wordnet metadata used for grading demonstrations. |
| examples/graders-openai-platform-docs-graders-2.json | A JSON example demonstrating the structure and schema for using the Graders feature within the OpenAI platform. | Exact payloads, commands, or snippets shown in A JSON example demonstrating the structure and schema for using the Graders feature within the OpenAI platform. |
| examples/graders-openai-graders-configuration-2.json | A JSON configuration object defining multiple grader types including text similarity and string checks for automated evaluation. | Exact payloads, commands, or snippets shown in A JSON configuration object defining multiple grader types including text similarity and string checks for automated... |
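
Several of the files listed above describe text similarity grading with an evaluation metric and a pass threshold. The idea can be sketched locally as follows. This is an illustration only: `difflib.SequenceMatcher` is a stand-in from the Python standard library and is not claimed to be the metric the platform uses, and the names `similarity_score`, `grade_similarity`, and `pass_threshold` are hypothetical here. The exact schema is in examples/graders-openai-graders-textsimilarity.json.

```python
from difflib import SequenceMatcher


def similarity_score(output_text: str, reference: str) -> float:
    """Return a similarity ratio in [0, 1] (stand-in metric, case-insensitive)."""
    return SequenceMatcher(None, output_text.lower(), reference.lower()).ratio()


def grade_similarity(output_text: str, reference: str, pass_threshold: float = 0.8):
    """Score a model output against a reference and apply a pass threshold.

    Returns a (score, passed) tuple. The 0.8 default is an arbitrary
    illustrative value, not a documented default.
    """
    score = similarity_score(output_text, reference)
    return score, score >= pass_threshold


if __name__ == "__main__":
    score, passed = grade_similarity("The capital is Paris", "the capital is paris")
    print(f"score={score:.2f} passed={passed}")
```

A threshold turns a continuous score into a pass/fail decision, which is why the metric and the threshold are specified together in a grader definition.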

What This Skill Covers

  • Graders are a way to evaluate your model's performance against reference answers. Our graders API is a way to test your graders, experiment with results, and...
  • Main sections: Overview, Templating, Item namespace, Sample namespace, String check grader.
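
To make the relationship between templating, the item and sample namespaces, and a string check grader concrete, here is a minimal local sketch. The grader field names (`type`, `name`, `operation`, `input`, `reference`), the operation names, and the placeholder paths `sample.output_text` and `item.reference_answer` are assumptions recalled from the public guide rather than taken from the attached files, and `render` and `run_string_check` are hypothetical helpers that only simulate the behavior. Check examples/graders-openai-graders-json-definition.json for the exact contract.

```python
import re

# Hypothetical grader specification; every field name here is an assumption.
string_check_grader = {
    "type": "string_check",
    "name": "exact_match",
    "operation": "eq",
    "input": "{{sample.output_text}}",
    "reference": "{{item.reference_answer}}",
}


def render(template: str, namespaces: dict) -> str:
    """Replace {{namespace.key}} placeholders with values from nested dicts."""
    def lookup(match: re.Match) -> str:
        value = namespaces
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


def run_string_check(grader: dict, item: dict, sample: dict) -> float:
    """Simulate a string check locally: 1.0 on a match, 0.0 otherwise."""
    namespaces = {"item": item, "sample": sample}
    actual = render(grader["input"], namespaces)
    expected = render(grader["reference"], namespaces)
    # Assumed semantics: "like" means the input contains the reference,
    # and "ilike" is the case-insensitive variant.
    operations = {
        "eq": lambda a, b: a == b,
        "ne": lambda a, b: a != b,
        "like": lambda a, b: b in a,
        "ilike": lambda a, b: b.lower() in a.lower(),
    }
    return 1.0 if operations[grader["operation"]](actual, expected) else 0.0


if __name__ == "__main__":
    item = {"reference_answer": "Paris"}   # dataset row (item namespace)
    sample = {"output_text": "Paris"}      # model output (sample namespace)
    print(run_string_check(string_check_grader, item, sample))
```

In this sketch the item namespace carries the dataset row and the sample namespace carries the model output, so one grader definition can be reused across every row being evaluated.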

Workflow

  1. Open the most relevant file under docs/ for the exact documented workflow and wording.
  2. Open schemas/ files, if any are attached, for exact structured contracts; none appear in the Reference Files table.
  3. Open examples/ files for concrete requests, commands, snippets, and manifests.
  4. Do not add behavior or configuration that is not present in the attached source files.

Canonical source: https://developers.openai.com/api/docs/guides/graders.md