Midjourney, primarily known for AI image generation, has released new research in collaboration with New York University on training large language models to produce more creative text. Carl Franzen reports for VentureBeat that the research introduces two new techniques: Diversified Direct Preference Optimization (DDPO) and Diversified Odds Ratio Preference Optimization (DORPO). These methods encourage LLMs such as Meta’s Llama and Mistral to generate more diverse outputs while maintaining coherence. The techniques modify the existing Direct Preference Optimization (DPO) and Odds Ratio Preference Optimization (ORPO) methods, respectively, by incorporating a “deviation score” that upweights rare but high-quality responses during training. In tests, a Llama-3.1-8B model trained with DDPO achieved the best balance of quality and diversity, producing more varied responses than GPT-4o while remaining readable. This research could benefit creative applications including marketing, corporate storytelling, and entertainment content generation.
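To make the core idea concrete, here is a minimal sketch of what a deviation-weighted preference loss might look like. The article does not include an implementation, so the function name, signature, and the way the deviation score is computed are assumptions for illustration; only the idea of scaling a standard DPO-style loss by a per-example deviation weight comes from the description above.

```python
import torch
import torch.nn.functional as F

def ddpo_loss(policy_chosen_logps, policy_rejected_logps,
              ref_chosen_logps, ref_rejected_logps,
              deviation, beta=0.1):
    """Sketch of a deviation-weighted DPO loss (hypothetical form).

    `deviation` is a per-example score, e.g. in [0, 1], measuring how much
    the chosen response differs from other responses to the same prompt
    (one plausible choice: mean embedding distance to sibling responses).
    """
    # Implicit rewards: how far the policy has shifted from the
    # reference model on each response (standard DPO bookkeeping).
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Standard DPO per-example loss: prefer chosen over rejected.
    per_example = -F.logsigmoid(chosen_rewards - rejected_rewards)
    # Diversification step: scale each example by its deviation score,
    # so rare-but-preferred responses pull the gradient harder.
    return (deviation * per_example).mean()

# Toy usage with random log-probabilities for a batch of 4 preference pairs.
logps = [torch.randn(4) for _ in range(4)]
deviation = torch.rand(4)  # stand-in for a real deviation measure
loss = ddpo_loss(*logps, deviation=deviation)
print(loss.item())
```

The effect, per the article's framing, is that an unusual response the human raters still preferred contributes more to the update than an equally preferred but common one, nudging the model away from converging on a single "safe" style.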