openai · OpenAI Platform Docs
Graders
Teaches how to implement and configure various grader types including string checks, text similarity, and model-based scoring to evaluate model performance against reference answers using JSON-based specifications and...
Derived skill
Files assembled from official documentation
When To Use
Use when you need to implement automated evaluation workflows to score model outputs using string matching, similarity metrics, or LLM-based grading.
Reference Files
| File | Contains | Use For |
|---|---|---|
| SKILL.md | Entry point: scope, routing table, and workflow. | Start here. |
| docs/graders-workflow-guide.md | A guide explaining how to use the Graders API to evaluate model performance using templating and specific item and sample namespaces. | Questions about a guide explaining how to use the Graders API to evaluate model performance using templating and specific item and sa... |
| examples/graders-openai-graders.json | A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation criteria. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation criteria. |
| examples/graders-openai-graders-configuration.json | A JSON configuration object defining multi-type grader rules for validating function names and arguments against reference outputs. | Exact payloads, commands, or snippets shown in A JSON configuration object defining multi-type grader rules for validating function names and arguments against refe... |
| examples/graders-openai-graders-json-definition.json | A JSON schema defining the structure for grader operations including string checks, comparison types, and reference inputs. | Exact payloads, commands, or snippets shown in A JSON schema defining the structure for grader operations including string checks, comparison types, and reference i... |
| examples/graders-openai-graders-textsimilarity.json | A JSON schema defining the structure for text similarity grading operations including evaluation metrics and pass thresholds. | Exact payloads, commands, or snippets shown in A JSON schema defining the structure for text similarity grading operations including evaluation metrics and pass thr... |
| examples/graders-openai-graders-scoremodel.json | A JSON schema defining the structure for a scoremodel object including input messages, model parameters, and pass thresholds for grading tasks. | Exact payloads, commands, or snippets shown in A JSON schema defining the structure for a scoremodel object including input messages, model parameters, and pass thr... |
| examples/graders-openai-graders-2.json | A JSON object demonstrating the structure for defining grader roles including system, developer, user, and assistant messages. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure for defining grader roles including system, developer, user, and assistant... |
| examples/graders-openai-graders-python-scoremodel-implementation.python | A Python script demonstrating how to define a dummy scoremodel grader using the OpenAI API. | Exact payloads, commands, or snippets shown in A Python script demonstrating how to define a dummy scoremodel grader using the OpenAI API. |
| examples/graders-openai-graders-3.json | A JSON object demonstrating the structure of a grader output including result scores and reasoning steps. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of a grader output including result scores and reasoning steps. |
| examples/graders-openai-graders-4.json | A JSON object demonstrating the required schema for defining grader descriptions and conclusions. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the required schema for defining grader descriptions and conclusions. |
| examples/graders-openai-graders-comparison-logic.text | A text-based example demonstrating the comparison logic for grading multiple answers using the graders guide. | Exact payloads, commands, or snippets shown in A text-based example demonstrating the comparison logic for grading multiple answers using the graders guide. |
| examples/graders-openai-graders-model-grader-comparison-logic.text | A text representation of the model_grader function logic used to compare multiple answers against a reference answer. | Exact payloads, commands, or snippets shown in A text representation of the model_grader function logic used to compare multiple answers against a reference answer. |
| examples/graders-openai-platform-docs-graders.json | A JSON representation of a grader configuration example used for evaluating model outputs. | Exact payloads, commands, or snippets shown in A JSON representation of a grader configuration example used for evaluating model outputs. |
| examples/graders-openai-graders-python-implementation.python | A Python code example demonstrating how to implement a custom grading function for the OpenAI Graders feature. | Exact payloads, commands, or snippets shown in A Python code example demonstrating how to implement a custom grading function for the OpenAI Graders feature. |
| examples/graders-openai-graders-5.json | A JSON object demonstrating the structure of grader outputs including text, JSON, tools, and audio fields. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of grader outputs including text, JSON, tools, and audio fields. |
| examples/graders-openai-graders-6.json | A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation keys. | Exact payloads, commands, or snippets shown in A JSON object demonstrating the structure of a grader configuration including reference answers and evaluation keys. |
| examples/graders-openai-graders-python-implementation-2.python | A Python script demonstrating how to implement a grading function using the OpenAI platform to evaluate sample outputs against reference answers. | Exact payloads, commands, or snippets shown in A Python script demonstrating how to implement a grading function using the OpenAI platform to evaluate sample output... |
| examples/graders-openai-graders-python-implementation-3.python | A Python code example demonstrating how to implement and use the Graders functionality within the OpenAI platform. | Exact payloads, commands, or snippets shown in A Python code example demonstrating how to implement and use the Graders functionality within the OpenAI platform. |
| examples/graders-openai-platform-docs-graders-python-dependencies.text | A list of required Python library versions including numpy, scipy, and scikit-learn for implementing graders. | Exact payloads, commands, or snippets shown in A list of required Python library versions including numpy, scipy, and scikit-learn for implementing graders. |
| examples/graders-openai-graders-punkt-wordnet.text | A text file containing sample data including punkt, stopwords, and wordnet metadata used for grading demonstrations. | Exact payloads, commands, or snippets shown in A text file containing sample data including punkt, stopwords, and wordnet metadata used for grading demonstrations. |
| examples/graders-openai-platform-docs-graders-2.json | A JSON example demonstrating the structure and schema for using the Graders feature within the OpenAI platform. | Exact payloads, commands, or snippets shown in A JSON example demonstrating the structure and schema for using the Graders feature within the OpenAI platform. |
| examples/graders-openai-graders-configuration-2.json | A JSON configuration object defining multiple grader types including text similarity and string checks for automated evaluation. | Exact payloads, commands, or snippets shown in A JSON configuration object defining multiple grader types including text similarity and string checks for automated... |
What This Skill Covers
- Graders are a way to evaluate your model's performance against reference answers. Our graders API is a way to test your graders, experiment with results, and...
- Main sections: Overview, Templating, Item namespace, Sample namespace, String check grader.
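
As a rough illustration of how templating, the item and sample namespaces, and a string check grader might fit together, here is a minimal local sketch. It does not call the Graders API. The field names (`type`, `operation`, `input`, `reference`), the `{{ ... }}` placeholder syntax, the keys `output_text` and `reference_answer`, and the operation semantics are assumptions recalled from the public guide, not taken from the attached files. The helpers `render` and `run_string_check` are hypothetical names invented for this example. Check the files under `examples/` for the exact payloads.

```python
import re

# Hypothetical grader spec. Field names and placeholder syntax are assumptions;
# confirm them against the JSON files under examples/ before relying on them.
string_check_grader = {
    "type": "string_check",
    "name": "exact_match",
    "operation": "eq",
    "input": "{{ sample.output_text }}",         # value produced by the model
    "reference": "{{ item.reference_answer }}",  # value stored with the dataset item
}

# Matches placeholders such as {{ item.reference_answer }}.
_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def render(template: str, namespaces: dict) -> str:
    """Replace each {{ namespace.key }} placeholder with its value from nested dicts."""
    def lookup(match: re.Match) -> str:
        value = namespaces
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)

    return _PLACEHOLDER.sub(lookup, template)


def run_string_check(grader: dict, item: dict, sample: dict) -> float:
    """Local stand-in for a string check grader: returns 1.0 on pass, 0.0 on fail."""
    namespaces = {"item": item, "sample": sample}
    actual = render(grader["input"], namespaces)
    expected = render(grader["reference"], namespaces)
    # Assumed semantics: eq/ne compare whole strings; like/ilike test whether
    # the input contains the reference (ilike ignores case).
    operations = {
        "eq": lambda a, b: a == b,
        "ne": lambda a, b: a != b,
        "like": lambda a, b: b in a,
        "ilike": lambda a, b: b.lower() in a.lower(),
    }
    return 1.0 if operations[grader["operation"]](actual, expected) else 0.0
```

Calling `run_string_check(string_check_grader, {"reference_answer": "Paris"}, {"output_text": "Paris"})` renders both templates and compares the two resulting strings. Keeping the spec as plain data, separate from the evaluation logic, mirrors the JSON-based configuration style the reference files describe.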
Workflow
- Open the most relevant file under `docs/` for the exact documented workflow and wording.
- Open `schemas/` files for exact structured contracts.
- Open `examples/` files for concrete requests, commands, snippets, and manifests.
- Do not add behavior or configuration that is not present in the attached source files.
Canonical source: https://developers.openai.com/api/docs/guides/graders.md
