Nous Research has quietly launched Hermes 4, a family of AI language models that the company claims matches leading commercial systems while removing most content restrictions. Michael Nuñez reports on the release for VentureBeat.
Unlike ChatGPT or Claude, Hermes 4 responds to nearly any request without the safety guardrails that have become standard in commercial AI systems. The model scored 57.1% on “RefusalBench,” a test measuring how often AI systems refuse to answer questions. On that benchmark a higher score means fewer refusals, and Hermes 4 significantly outperformed GPT-4o at 17.67% and Claude Sonnet 4 at 17%.
The startup’s largest model, with 405 billion parameters, achieved 96.3% on the MATH-500 benchmark and 81.9% on the challenging AIME’24 mathematics competition. These scores rival proprietary systems that cost millions more to develop.
Hermes 4 introduces “hybrid reasoning,” allowing users to toggle between fast responses and deeper thinking processes. The system shows its internal reasoning before providing final answers, similar to OpenAI’s o1 models but with full transparency.
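To make the "reasoning before the final answer" idea concrete, here is a minimal Python sketch of how a client might separate a visible reasoning trace from the answer. It assumes the trace is wrapped in `<think>...</think>` tags; that delimiter and the helper name `split_reasoning` are illustrative assumptions, not details confirmed by the article.

```python
import re

# Assumed delimiter format for the reasoning trace; not confirmed by the article.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(response: str) -> tuple[str, str]:
    """Return (reasoning, answer). Reasoning is empty when no trace is present."""
    match = THINK_PATTERN.search(response)
    if match is None:
        # Fast-response mode: the whole text is the answer.
        return "", response.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first tagged span is treated as the final answer.
    answer = (response[:match.start()] + response[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_reasoning("<think>2 + 2 is 4.</think>The answer is 4.")
print("reasoning:", reasoning)
print("answer:", answer)
```

A response without the tags passes through unchanged as the answer, which mirrors the fast mode of the toggle described above.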
The models were trained using 192 Nvidia B200 GPUs and two novel systems: DataForge for synthetic data generation and Atropos for reinforcement learning. The training process required 71,616 GPU hours for the largest model.
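For a sense of scale, the two reported figures can be combined into a rough wall-clock estimate. The sketch below assumes all 192 GPUs ran concurrently for the entire job, which the article does not state, so the result is only a back-of-the-envelope figure.

```python
# Figures reported in the article for the largest (405B) model.
GPU_COUNT = 192
GPU_HOURS = 71_616

# Assumption: every GPU was in use for the full duration of the run.
wall_clock_hours = GPU_HOURS / GPU_COUNT
wall_clock_days = wall_clock_hours / 24

print(f"{wall_clock_hours:.0f} hours, about {wall_clock_days:.1f} days")
```

Under that assumption the run works out to 373 hours, or roughly 15.5 days of continuous training.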
Nous Research raised $65 million earlier this year and positions itself as an advocate for open-source AI without corporate content policies. The models are available for download on Hugging Face and through API access.