Multimodal Arena sees GPT-4o in the lead

The new “Multimodal Arena” from LMSYS compares AI models on image-related tasks and shows OpenAI’s GPT-4o in the lead, closely followed by Claude 3.5 Sonnet and Gemini 1.5 Pro. Surprisingly, open-source models such as LLaVA-v1.6-34B achieve results comparable to some proprietary models. The catch? Despite this progress, Princeton’s CharXiv benchmark shows that AI still lags far behind humans on complex tasks such as interpreting scientific graphs.
