openai · OpenAI Platform Docs

Reinforcement fine-tuning | OpenAI API

Explains the process and methodology for performing reinforcement fine-tuning to optimize model performance based on specific feedback or reward signals.

Derived skill

Files assembled from official documentation

When To Use

Use when you need to adapt an OpenAI reasoning model with reinforcement fine-tuning, using a grader you define as the feedback signal.

Reference Files

What This Skill Covers

Reinforcement fine-tuning (RFT) adapts an OpenAI reasoning model with a feedback signal you define. Like supervised fine-tuning, it tailors the model to your...
Main sections: Example: LLM-powered security review, Define a grader, Grading Criteria:, Copernicus Product Security Policy, Introduction.

Workflow

Open the most relevant file under docs/ for the exact documented workflow and wording.
Open schemas/ files for exact structured contracts.
Open examples/ files for concrete requests, commands, snippets, and manifests.
Do not add behavior or configuration that is not present in the attached source files.

Canonical source: https://developers.openai.com/api/docs/guides/reinforcement-fine-tuning

Skill metadata

Name: Reinforcement fine-tuning | OpenAI API
Author: Bruno HANSS - Prompt Buddy
Generation mode: AI-assisted, human-authored
Source count: 1

Provenance

Source program: OpenAI Platform Docs
Last generated: May 11, 2026
Last source sync: Unknown
Source pages: 1

Safety model

Canonical source pages are preserved separately. Derived files record source evidence and require zero AI-generated facts.

Source links

https://developers.openai.com/api/docs/guides/reinforcement-fine-tuning

File	Contains	Use For
`SKILL.md`	Entry point: scope, routing table, and workflow.	Start here.
`docs/reinforcement-fine-tuning-openai-api-workflow-guide.md`	A guide explaining how to adapt OpenAI reasoning models using reinforcement fine-tuning with custom feedback signals and grading criteria.	Questions about a guide explaining how to adapt OpenAI reasoning models using reinforcement fine-tuning with custom feedback signals...
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-gu.text`	A text-based guide explaining the concepts and implementation of reinforcement fine-tuning using the OpenAI API.	Exact payloads, commands, or snippets shown in A text-based guide explaining the concepts and implementation of reinforcement fine-tuning using the OpenAI API.
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-tr.text`	A text file demonstrating the structured training data format for reinforcement fine-tuning including compliant and explanation fields.	Exact payloads, commands, or snippets shown in A text file demonstrating the structured training data format for reinforcement fine-tuning including compliant and e...
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-gr.text`	A text configuration defining a multi-type grader with a gpt-4o score model for reinforcement fine-tuning evaluation.	Exact payloads, commands, or snippets shown in A text configuration defining a multi-type grader with a gpt-4o score model for reinforcement fine-tuning evaluation.
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-ev.text`	A text-based list of evaluation criteria used to assess the accuracy of model-generated answers during reinforcement fine-tuning.	Exact payloads, commands, or snippets shown in A text-based list of evaluation criteria used to assess the accuracy of model-generated answers during reinforcement...
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-me.text`	A text representation of the messages format used for reinforcement fine-tuning training data.	Exact payloads, commands, or snippets shown in A text representation of the messages format used for reinforcement fine-tuning training data.
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-js.text`	A JSONL formatted dataset containing user messages, compliance labels, and explanations for reinforcement fine-tuning.	Exact payloads, commands, or snippets shown in A JSONL formatted dataset containing user messages, compliance labels, and explanations for reinforcement fine-tuning.
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-da.text`	A text file containing JSONL formatted training data examples for reinforcement fine-tuning, including user messages, compliance status, and explanations.	Exact payloads, commands, or snippets shown in A text file containing JSONL formatted training data examples for reinforcement fine-tuning, including user messages,...
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-js-2.text`	A JSON schema definition for a security assistant used in reinforcement fine-tuning examples.	Exact payloads, commands, or snippets shown in A JSON schema definition for a security assistant used in reinforcement fine-tuning examples.
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-py.text`	A Python code snippet demonstrating how to use `to_strict_json_schema` from `openai.lib._pydantic` to generate a compatible JSON schema for reinforcement fine-tuning.	Exact payloads, commands, or snippets shown in A Python code snippet demonstrating how to use `to_strict_json_schema` from `openai.lib._pydantic` to generate a compatible J...
`examples/reinforcement-fine-tuning-openai-api-openai-api-reinforcement-finetuning.text`	A curl command demonstrating how to create a reinforcement fine-tuning job via the OpenAI API.	Exact payloads, commands, or snippets shown in A curl command demonstrating how to create a reinforcement fine-tuning job via the OpenAI API.
`examples/reinforcement-fine-tuning-openai-api-openai-api-reinforcement-fine-tunin.text`	A curl command demonstrating how to send a request to the OpenAI API using a reinforcement fine-tuned model.	Exact payloads, commands, or snippets shown in A curl command demonstrating how to send a request to the OpenAI API using a reinforcement fine-tuned model.
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-jo.text`	A text representation of an OpenAI reinforcement fine-tuning job checkpoint object containing metrics and model checkpoint details.	Exact payloads, commands, or snippets shown in A text representation of an OpenAI reinforcement fine-tuning job checkpoint object containing metrics and model check...
`examples/reinforcement-fine-tuning-openai-api-openai-reinforcement-fine-tuning-jo-2.text`	A text log representing a JSON object of a reinforcement fine-tuning job event from the OpenAI API.	Exact payloads, commands, or snippets shown in A text log representing a JSON object of a reinforcement fine-tuning job event from the OpenAI API.
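
Several of the example files above describe JSONL training data that pairs user messages with compliance labels and explanations. As a rough illustration of that shape, the sketch below checks a JSONL string before upload. The field names `messages`, `compliant`, and `explanation` are taken from the file descriptions. The helper names (`validate_row`, `validate_jsonl`) and the `role`/`content` keys inside each message are assumptions made for illustration. Always check the files under examples/ for the exact documented payloads.

```python
import json

# Field names come from the file descriptions in the table above.
# Everything else here (helper names, message keys) is hypothetical.
REQUIRED_FIELDS = ("messages", "compliant", "explanation")


def validate_row(row: dict) -> list[str]:
    """Return a list of problems found in one training row (empty if none)."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in row]
    if "messages" in row:
        messages = row["messages"]
        if not isinstance(messages, list) or not messages:
            problems.append("messages must be a non-empty list")
        else:
            for i, msg in enumerate(messages):
                # Assumed message shape: {"role": ..., "content": ...}
                if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
                    problems.append(f"messages[{i}] needs 'role' and 'content'")
    return problems


def validate_jsonl(text: str) -> dict[int, list[str]]:
    """Validate JSONL text; map 1-based line numbers to their problems."""
    report: dict[int, list[str]] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            report[lineno] = [f"invalid JSON: {exc.msg}"]
            continue
        if not isinstance(row, dict):
            report[lineno] = ["row must be a JSON object"]
            continue
        problems = validate_row(row)
        if problems:
            report[lineno] = problems
    return report
```

An empty report means every non-blank line parsed as a JSON object with the three expected fields. It does not confirm that the data matches what the API accepts.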