OpenAI enables reinforcement fine-tuning of o4-mini model for enterprises

OpenAI has announced that third-party developers can now use reinforcement fine-tuning (RFT) on its o4-mini reasoning model. As reported by Carl Franzen in VentureBeat, this capability lets enterprises customize the model to their specific needs and internal data, then deploy the fine-tuned versions through OpenAI's API and connect them to company systems.

RFT uses a feedback loop during training: a grader model scores the model's outputs, and those scores drive adjustments to the model's weights. Several companies have already adopted RFT, including Accordance AI, which reported a 39% improvement in tax analysis accuracy. The service is priced at $100 per hour of core training time. OpenAI cautions that while RFT offers significant customization benefits, fine-tuned models may be more prone to jailbreaks and hallucinations.
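To make the grader-driven feedback loop concrete, here is a minimal, purely illustrative Python sketch. Everything in it (the token-overlap grader, the scalar "policy score," the update rule) is a hypothetical stand-in, not OpenAI's actual RFT implementation or API; it only shows the shape of the loop the article describes: generate an output, score it with a grader, and use the score as a reward signal to adjust the model.

```python
def grade(sample_output: str, reference: str) -> float:
    """Hypothetical grader: fraction of reference tokens found in the output.

    A real RFT grader would be a model or rubric scoring task-specific
    correctness; this toy version just measures token overlap.
    """
    ref_tokens = reference.split()
    if not ref_tokens:
        return 0.0
    out_tokens = set(sample_output.split())
    hits = sum(tok in out_tokens for tok in ref_tokens)
    return hits / len(ref_tokens)


def rft_step(policy_score: float, reward: float, lr: float = 0.1) -> float:
    """Toy update: nudge a scalar stand-in for the model's weights
    toward behavior that earned a higher reward."""
    return policy_score + lr * (reward - policy_score)


# One simulated training step: grade a model output, then update.
reward = grade("the tax rate is 21 percent", "tax rate 21 percent")
updated = rft_step(policy_score=0.5, reward=reward)
```

In the real service, this loop runs inside OpenAI's training infrastructure over many samples; the enterprise supplies the training data and the grading criteria rather than the update code.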
