Reinforcement fine-tuning

A guide on implementing reinforcement fine-tuning (RFT) for reasoning models by defining a programmable grader, preparing datasets, and managing the training lifecycle.

Derived skill

Files assembled from official documentation

When To Use

Use when you need to adapt a reasoning model to complex, domain-specific tasks using a programmable reward signal instead of fixed correct answers.
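As a rough illustration of what a "programmable reward signal" can look like, the sketch below scores one model output against a reference record and returns a number between 0 and 1. This is not the grader format from the OpenAI docs: the function name `grade_response`, the field names `compliant` and `explanation`, and the 0.8/0.2 weighting are all hypothetical choices made for this example.

```python
import json


def grade_response(model_output: str, reference: dict) -> float:
    """Return a reward in [0.0, 1.0] for one model output (hypothetical scoring)."""
    try:
        parsed = json.loads(model_output)
    except json.JSONDecodeError:
        return 0.0  # output that is not valid JSON earns no reward
    if not isinstance(parsed, dict):
        return 0.0

    score = 0.0
    # Most of the reward: the predicted label matches the reference label.
    # "compliant" is an assumed field name, not one confirmed by the source.
    if "compliant" in parsed and parsed["compliant"] == reference.get("compliant"):
        score += 0.8
    # Partial credit for a non-empty explanation (assumed field name).
    explanation = parsed.get("explanation")
    if isinstance(explanation, str) and explanation.strip():
        score += 0.2
    return score
```

The point of the partial credit is that the reward can distinguish "wrong label but well-formed answer" from "unusable output", which a single fixed correct answer cannot express.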

Reference Files

File	Contains	Use For
`SKILL.md`	Entry point: scope, routing table, and workflow.	Start here.
`docs/reinforcement-fine-tuning-workflow-guide.md`	A guide explaining how to adapt OpenAI reasoning models using custom feedback signals and grader definitions.	Questions about the workflow, feedback signals, or grader definitions covered by the guide.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning.text`	A text-based example demonstrating the reinforcement fine-tuning process for OpenAI models.	Exact payloads, commands, or snippets shown in this example.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-dataset.json`	A JSON-formatted dataset example containing compliant responses and explanations for reinforcement fine-tuning.	Exact payloads, commands, or snippets shown in this dataset example.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-messages-form.json`	A JSON object demonstrating the message structure and field requirements for reinforcement fine-tuning datasets.	Exact payloads, commands, or snippets shown in this message-structure example.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-dataset-forma.text`	A text file containing a JSONL-formatted dataset of user messages, compliance labels, and explanations for reinforcement fine-tuning.	Exact payloads, commands, or snippets shown in this JSONL dataset.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-dataset-forma-2.text`	A text file containing a JSONL-formatted dataset for reinforcement fine-tuning featuring user messages, compliance status, and explanations.	Exact payloads, commands, or snippets shown in this second JSONL dataset.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-jsonschema.json`	A JSON schema definition for a security assistant used in reinforcement fine-tuning examples.	Exact payloads, commands, or snippets shown in this JSON schema.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-python-pydant.python`	A Python script demonstrating how to use `to_strict_json_schema` to convert a Pydantic model into a JSON schema for reinforcement fine-tuning.	Exact payloads, commands, or snippets shown in this Python script.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-job-creation-.bash`	A curl command demonstrating how to create a reinforcement fine-tuning job using the OpenAI API.	Exact payloads, commands, or snippets shown in this curl command.
`examples/reinforcement-fine-tuning-openai-reinforcement-fine-tuning-job-event-log.json`	A JSON object representing a real-time event log from a reinforcement fine-tuning job, including training metrics and reward scores.	Exact payloads, commands, or snippets shown in this event log.
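Several of the files above are described as JSONL datasets of user messages, compliance labels, and explanations. A minimal sketch of a local sanity check for such a file might look like the following. The helper `validate_jsonl` is hypothetical, and the field names `messages`, `compliant`, and `explanation` are assumptions inferred from the descriptions in the table; check the example files for the actual keys before relying on it.

```python
import json

# Assumed reference fields; the real dataset files may use different keys.
REQUIRED_REFERENCE_FIELDS = ("compliant", "explanation")


def validate_jsonl(text: str) -> list[str]:
    """Return a list of human-readable problems found in a JSONL string."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {number}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected a JSON object")
            continue
        # Each record is assumed to carry a non-empty list of chat messages.
        messages = record.get("messages")
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {number}: missing or empty 'messages' list")
        for field in REQUIRED_REFERENCE_FIELDS:
            if field not in record:
                problems.append(f"line {number}: missing '{field}'")
    return problems
```

Running a check like this before upload catches malformed lines early; an empty returned list means every line parsed and carried the assumed fields.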

What This Skill Covers

Reinforcement fine-tuning (RFT) adapts an OpenAI reasoning model with a feedback signal you define. Like supervised fine-tuning, it tailors the model to your...
Main sections: "Example: LLM-powered security review", "Define a grader", "Prepare your dataset", "Upload your files", "Create a fine-tune job".

Workflow

Open the most relevant file under docs/ for the exact documented workflow and wording.
Open schemas/ files for exact structured contracts.
Open examples/ files for concrete requests, commands, snippets, and manifests.
Do not add behavior or configuration that is not present in the attached source files.

Canonical source: https://developers.openai.com/api/docs/guides/reinforcement-fine-tuning.md

Skill metadata

Name: Reinforcement fine-tuning
Author: Bruno HANSS - Prompt Buddy
Generation mode: AI-assisted, human-authored
Source count: 1

Provenance

Source program: OpenAI Platform Docs
Last generated: May 11, 2026
Last source sync: Unknown
Source pages: 1

Safety model

Canonical source pages are preserved separately. Derived files record source evidence and require zero AI-generated facts.

Source links

https://developers.openai.com/api/docs/guides/reinforcement-fine-tuning.md